| Name | Last commit message | Last commit date |
| --- | --- | --- |
| cutlass_extensions | Clean up of CUTLASS_VERSION (#152947) | 2025-05-08 08:32:34 +00:00 |
| linalg | [1/N][Fix] Fix typo in aten folder (#166126) | 2025-10-27 15:34:39 +00:00 |
| AbsKernel.cu | | |
| Activation.cpp | c10::string_view -> std::string_view in aten (#141903) | 2024-12-07 23:23:52 +00:00 |
| Activation.h | Modernize C++ code in aten/src/ATen/ (#141424) | 2024-11-24 02:15:19 +00:00 |
| ActivationEluKernel.cu | | |
| ActivationGeluKernel.cu | | |
| ActivationGluKernel.cu | | |
| ActivationHardshrinkKernel.cu | | |
| ActivationHardsigmoidKernel.cu | [ROCm] fix hardsigmoid op (#162758) | 2025-09-12 15:07:13 +00:00 |
| ActivationHardswishKernel.cu | Fix torch.nn.functional.hardswish gradients corner case (#148049) | 2025-03-14 18:53:10 +00:00 |
| ActivationHardtanhKernel.cu | | |
| ActivationLeakyReluKernel.cu | | |
| ActivationLogSigmoidKernel.cu | | |
| ActivationMishKernel.cu | | |
| ActivationPreluKernel.cu | | |
| ActivationSiluKernel.cu | | |
| ActivationSoftplusKernel.cu | | |
| ActivationSoftshrinkKernel.cu | softshrink nan fixes (#138421) | 2024-11-21 23:06:08 +00:00 |
| ActivationThresholdKernel.cu | | |
| AdaptiveAveragePooling.cu | [Doc fix] fix spelling of enough (#159587) | 2025-08-01 01:50:57 +00:00 |
| AdaptiveAveragePooling3d.cu | Fix incorrect stride handling in adaptive_avg_pool3d (#157326) | 2025-07-01 03:03:48 +00:00 |
| AdaptiveMaxPooling2d.cu | | |
| AdaptiveMaxPooling3d.cu | | |
| airy_ai.cu | | |
| AmpKernels.cu | Fix broken URLs (#152237) | 2025-04-27 09:56:42 +00:00 |
| AveragePool2d.cu | [CUDA][avgpool2d] Fix backward launch bounds again for sm100, sm120 (#150640) | 2025-04-04 13:05:40 +00:00 |
| AveragePool3d.cu | | |
| bessel_j1.cu | | |
| bessel_j0.cu | | |
| bessel_y1.cu | | |
| bessel_y0.cu | | |
| BinaryBitwiseOpsKernels.cu | | |
| BinaryDivFloorKernel.cu | | |
| BinaryDivTrueKernel.cu | | |
| BinaryDivTruncKernel.cu | | |
| BinaryGeometricKernels.cu | | |
| BinaryInternal.h | | |
| BinaryLogicalOpsKernels.cu | | |
| BinaryMiscBackwardOpsKernels.cu | | |
| BinaryMiscOpsKernels.cu | | |
| BinaryMulKernel.cu | | |
| BinaryRemainderKernel.cu | | |
| BinaryShiftOpsKernels.cu | | |
| Blas.cpp | [CUDA][cuBLASLt] addmm -- extend bias fusions to cases with (1 by n) shapes (#166307) | 2025-10-31 14:30:41 +00:00 |
| block_reduce.cuh | [ROCm] Remove use of warpsize on host-side compilation (#156979) | 2025-07-01 04:55:31 +00:00 |
| Bucketization.cu | [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) | 2024-12-19 00:55:11 +00:00 |
| chebyshev_polynomial_t.cu | | |
| chebyshev_polynomial_u.cu | | |
| chebyshev_polynomial_v.cu | | |
| chebyshev_polynomial_w.cu | | |
| Col2Im.cu | | |
| CompareEQKernel.cu | | |
| CompareKernels.cu | | |
| ComplexKernel.cu | | |
| CompositeRandomAccessor.h | | |
| ConvolutionMM2d.cu | | |
| Copy.cu | [PyTorch] Use events from pool in copy_device_to_device (#165647) | 2025-10-28 05:19:05 +00:00 |
| Copy.h | | |
| CopysignKernel.cu | | |
| CrossKernel.cu | | |
| cuBlasCommonArgs.h | [1/2] Split cublasCommonArgs into its own file (#166313) | 2025-10-28 16:35:32 +00:00 |
| CUDAJitLoops.cuh | [ATen][CUDA] Implement 128 bit vectorization v2 (#145746) | 2025-01-31 06:42:08 +00:00 |
| CUDALoops.cuh | Update workaround to old CUDA bug (#164354) (#165984) | 2025-10-21 19:09:43 +00:00 |
| CUDAScalar.cu | [ROCm] delete un-needed workaround for tensor.item() (#158486) | 2025-07-23 00:31:57 +00:00 |
| CuFFTPlanCache.h | [Doc fix] fix spelling of enough (#159587) | 2025-08-01 01:50:57 +00:00 |
| CuFFTUtils.h | [ATen][CUDA][cuFFT] Guard against deprecated error codes (#159466) | 2025-07-30 21:10:32 +00:00 |
| CumminmaxKernel.cu | | |
| CumprodKernel.cu | | |
| CumsumKernel.cu | | |
| cutlass_common.cuh | [CUTLASS] [CUDA] SM100 GroupMM (#156203) | 2025-06-28 23:02:00 +00:00 |
| DepthwiseConv2d.cu | Work around buggy use_const_ref_for_mutable_tensors (#145530) | 2025-01-24 14:38:49 +00:00 |
| DepthwiseConv3d.cu | | |
| DeviceSqrt.cuh | | |
| DilatedMaxPool2d.cu | Turn some const variables into constexpr in C++ code (#165401) | 2025-10-17 13:24:46 +00:00 |
| DilatedMaxPool3d.cu | | |
| DistanceKernel.cu | [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) | 2024-12-19 00:55:11 +00:00 |
| DistributionBernoulli.cu | | |
| DistributionCauchyKernel.cu | | |
| DistributionExponentialKernel.cu | | |
| DistributionGeometricKernel.cu | | |
| DistributionLogNormalKernel.cu | | |
| DistributionNormal.cu | | |
| DistributionRandomKernel.cu | | |
| Distributions.cpp | | |
| Distributions.cu | | |
| Distributions.h | | |
| DistributionTemplates.h | [1/N][Fix] Fix typo in aten folder (#166126) | 2025-10-27 15:34:39 +00:00 |
| DistributionUniform.cu | | |
| Dropout.cu | [ATen][CUDA] Implement 128 bit vectorization v2 (#145746) | 2025-01-31 06:42:08 +00:00 |
| Embedding.cu | Remove CUDA 11 workarounds for CUB_SUPPORTS_SCAN_BY_KEY and CUB_SUPPORTS_UNIQUE_BY_KEY (#164637) | 2025-10-18 20:05:54 +00:00 |
| EmbeddingBackwardKernel.cu | Remove CUDA 11 workarounds for CUB_SUPPORTS_SCAN_BY_KEY and CUB_SUPPORTS_UNIQUE_BY_KEY (#164637) | 2025-10-18 20:05:54 +00:00 |
| EmbeddingBackwardKernel.cuh | | |
| EmbeddingBag.cu | Remove CUDA 11 workarounds for CUB_SUPPORTS_SCAN_BY_KEY and CUB_SUPPORTS_UNIQUE_BY_KEY (#164637) | 2025-10-18 20:05:54 +00:00 |
| Equal.cpp | | |
| FillKernel.cu | | |
| FlattenIndicesKernel.cu | | |
| ForeachBinaryOpList.cu | [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) | 2025-09-21 05:24:13 +00:00 |
| ForeachBinaryOpScalar.cu | [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) | 2025-09-21 05:24:13 +00:00 |
| ForeachBinaryOpScalarList.cu | [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) | 2025-09-21 05:24:13 +00:00 |
| ForeachBinaryOpScalarTensor.cu | [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) | 2025-09-21 05:24:13 +00:00 |
| ForeachFunctors.cuh | Revert "handling special case for pow(3) for GPU (#157537)" | 2025-08-19 22:57:45 +00:00 |
| ForeachMinMaxFunctors.cuh | | |
| ForeachPointwiseOp.cu | [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) | 2025-09-21 05:24:13 +00:00 |
| ForeachReduceOp.cu | chunk_size should always be int64_t for Foreach functors (#156872) | 2025-06-27 22:35:34 +00:00 |
| ForeachTernaryOp.cu | [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) | 2025-09-21 05:24:13 +00:00 |
| ForeachUnaryOp.cu | [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) | 2025-09-21 05:24:13 +00:00 |
| FractionalMaxPool2d.cu | | |
| FractionalMaxPool3d.cu | [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) | 2024-12-19 00:55:11 +00:00 |
| FunctionOfAMatrixUtilsKernel.cu | | |
| fused_adagrad_impl.cu | Split out C++ code from fused adagrad PR (#159008) | 2025-07-26 00:36:59 +00:00 |
| fused_adagrad_impl.cuh | Split out C++ code from fused adagrad PR (#159008) | 2025-07-26 00:36:59 +00:00 |
| fused_adagrad_utils.cuh | [BugFix] chunk_size should always be int64_t (#165971) | 2025-10-21 19:52:47 +00:00 |
| fused_adam_amsgrad_impl.cu | | |
| fused_adam_amsgrad_impl.cuh | | |
| fused_adam_impl.cu | | |
| fused_adam_impl.cuh | | |
| fused_adam_utils.cuh | chunk_size should always be int64_t for Foreach functors (#156872) | 2025-06-27 22:35:34 +00:00 |
| fused_adamw_amsgrad_impl.cu | | |
| fused_adamw_amsgrad_impl.cuh | | |
| fused_adamw_impl.cu | | |
| fused_adamw_impl.cuh | | |
| FusedAdagradKernel.cu | Split out C++ code from fused adagrad PR (#159008) | 2025-07-26 00:36:59 +00:00 |
| FusedAdamKernel.cu | [5/N] Apply bugprone-unchecked-optional-access (#143111) | 2024-12-15 01:07:28 +00:00 |
| FusedAdamWKernel.cu | [5/N] Apply bugprone-unchecked-optional-access (#143111) | 2024-12-15 01:07:28 +00:00 |
| FusedSgdKernel.cu | chunk_size should always be int64_t for Foreach functors (#156872) | 2025-06-27 22:35:34 +00:00 |
| GcdLcmKernel.cu | | |
| GridSampler.cpp | | |
| GridSampler.cu | | |
| GridSampler.cuh | | |
| GridSampler.h | Modernize C++ code in aten/src/ATen/ (#141424) | 2024-11-24 02:15:19 +00:00 |
| group_norm_kernel.cu | [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) | 2024-12-19 00:55:11 +00:00 |
| GroupedBlas.cpp | Add MXFP4 grouped gemm support via. FBGEMM kernels (#166530) | 2025-10-30 16:46:11 +00:00 |
| GroupMM.cu | improve shape checks for grouped_mm (#159666) | 2025-08-02 00:12:25 +00:00 |
| GroupMM.h | bf16 grouped gemm (#150374) | 2025-04-06 04:53:24 +00:00 |
| GroupMMCommon.cuh | improve shape checks for grouped_mm (#159666) | 2025-08-02 00:12:25 +00:00 |
| hermite_polynomial_h.cu | | |
| hermite_polynomial_he.cu | | |
| IGammaKernel.cu | Turn some const variables into constexpr in C++ code (#165401) | 2025-10-17 13:24:46 +00:00 |
| Im2Col.cu | | |
| im2col.cuh | [BE] Remove unusued channels arg in col2im (#142336) | 2024-12-09 01:49:41 +00:00 |
| Indexing.cu | [ROCm] Adjust grid size for non-unit stride backwards indexing (#165026) | 2025-10-10 16:36:38 +00:00 |
| IndexKernel.cpp | [4/N] Avoid copy in std::get (#142285) | 2024-12-09 07:59:35 +00:00 |
| IndexKernel.cu | [CUDA] fix indexing on large tensor causing nvalid configuration argument (#164049) | 2025-09-29 06:07:35 +00:00 |
| IndexKernel.h | Modernize C++ code in aten/src/ATen/ (#141424) | 2024-11-24 02:15:19 +00:00 |
| IndexKernelUtils.cu | Add a compile-time flag to trigger verbose logging for device-side asserts (#166171) | 2025-10-30 19:43:46 +00:00 |
| IndexKernelUtils.h | Support more dtypes for input, indices in gather (#151822) | 2025-05-01 16:35:23 +00:00 |
| int4mm.cu | Remove old ROCm version checks and branches (#166111) | 2025-10-27 05:32:54 +00:00 |
| int8mm.cu | [WOQ] Integrate CUDA support for int8pack_mm woq optimization pattern (#161680) | 2025-09-17 10:24:13 +00:00 |
| jit_utils.cpp | [1/N][Fix] Fix typo in aten folder (#166126) | 2025-10-27 15:34:39 +00:00 |
| jit_utils.h | add the torch.float8_e8m0fnu dtype to PyTorch (#147466) | 2025-02-20 13:55:42 +00:00 |
| JitLoops.cuh | | |
| KernelUtils.cuh | Remove old ROCm version checks and branches (#166111) | 2025-10-27 05:32:54 +00:00 |
| laguerre_polynomial_l.cu | | |
| LaunchUtils.h | | |
| layer_norm_kernel.cu | [ROCm] Disable __builtin_amdgcn_rcpf for gfx90a (#166454) | 2025-10-30 23:39:00 +00:00 |
| legendre_polynomial_p.cu | | |
| Lerp.cu | Fix torch.lerp RuntimeError when weight is CPU scalar while input & end are CUDA tensor (#141820) | 2024-12-09 18:14:54 +00:00 |
| LinearAlgebra.cu | | |
| LinearAlgebraStubs.cpp | [1/N][Fix] Fix typo in aten folder (#166126) | 2025-10-27 15:34:39 +00:00 |
| LogAddExpKernel.cu | | |
| LogcumsumexpKernel.cu | Remove old workaround in launch_logcumsumexp_cuda_kernel (#164567) | 2025-10-03 18:07:02 +00:00 |
| Loops.cuh | Simplify c10::guts::apply (#164566) | 2025-10-22 00:47:43 +00:00 |
| Loss.cu | Removed ROCM ifdef that governs thread count + smem parallel reduction. (#149779) | 2025-03-29 04:27:54 +00:00 |
| LossCTC.cu | [CUDA] Decrease launch bounds of CTCLoss backward for blackwell (#159522) | 2025-08-05 19:26:25 +00:00 |
| Math.cuh | Turn some const variables into constexpr in C++ code (#165401) | 2025-10-17 13:24:46 +00:00 |
| MaxMinElementwiseKernel.cu | | |
| MaxUnpooling.cu | [BUG] MaxUnpool2d/3d should check output dim before accessing its elements (#163507) | 2025-09-22 21:36:48 +00:00 |
| MemoryAccess.cuh | [ROCm] Improve vectorized elementwise kernel performance in MI300X (#153634) | 2025-05-27 20:49:32 +00:00 |
| MiscUtils.h | Enable modernize-use-default-member-init (#149046) | 2025-04-09 11:57:24 +00:00 |
| MixedDtypesLinear.cu | Remove outdated CUDA 11 conditions (#154313) | 2025-05-28 08:44:58 +00:00 |
| modified_bessel_i1.cu | | |
| modified_bessel_i0.cu | | |
| modified_bessel_k1.cu | | |
| modified_bessel_k0.cu | | |
| MultiLabelMarginCriterion.cu | | |
| MultiMarginLoss.cu | [CUDA] Fix missing __syncthreads in MultiMarginLoss backward (#158994) | 2025-07-24 20:47:29 +00:00 |
| MultinomialKernel.cu | [ROCm] Remove use of warpsize on host-side compilation (#156979) | 2025-07-01 04:55:31 +00:00 |
| MultiTensorApply.cuh | | |
| NaiveConvolutionTranspose2d.cu | | |
| NaiveConvolutionTranspose3d.cu | | |
| NaiveDilatedConvolution.cu | | |
| NLLLoss2d.cu | [cuda] fix nll_loss2d backward bounds check with reduction=none (#165247) | 2025-10-20 06:25:11 +00:00 |
| Nonzero.cu | Remove C++ and test branches for CUDA<12 (#163443) | 2025-09-22 18:20:08 +00:00 |
| Normalization.cu | Add assertion to align with cuda (#153233) | 2025-05-23 07:32:43 +00:00 |
| Normalization.cuh | Use std::min for #166021 (#166195) | 2025-10-27 17:57:44 +00:00 |
| PersistentSoftmax.cuh | Improve softmax's perf in cuda (#144679) | 2025-01-23 00:02:57 +00:00 |
| PointwiseOpsKernel.cu | Remove outdated CUDA 11 conditions (#154313) | 2025-05-28 08:44:58 +00:00 |
| Pow.cuh | Workaround ATen SFINAE under libc++ (#161101) | 2025-08-21 00:55:58 +00:00 |
| PowKernel.cu | Revert "handling special case for pow(3) for GPU (#157537)" | 2025-08-19 22:57:45 +00:00 |
| Randperm.cu | | |
| Randperm.cuh | [4/N] Avoid copy in std::get (#142285) | 2024-12-09 07:59:35 +00:00 |
| RangeFactories.cu | [CUDA][MPS] Fix torch.arange bound validation for large float inputs (#154320) | 2025-06-05 14:51:25 +00:00 |
| RecordStream.cu | | |
| Reduce.cu | | |
| Reduce.cuh | [ATen] Fix CUDA reduction warp shuffle order (#164790) | 2025-10-21 00:09:13 +00:00 |
| ReduceAMinMaxKernel.cu | | |
| ReduceArgMaxKernel.cu | | |
| ReduceArgMinKernel.cu | | |
| ReduceLogicKernel.cu | | |
| ReduceMaxValuesKernel.cu | | |
| ReduceMinValuesKernel.cu | | |
| ReduceMomentKernel.cu | [ATen] Vectorize 8 elements on 16 bit data types for sum/mean (#165055) | 2025-10-17 13:39:36 +00:00 |
| ReduceNormKernel.cu | | |
| ReduceOps.cpp | | |
| ReduceOps.h | Modernize C++ code in aten/src/ATen/ (#141424) | 2024-11-24 02:15:19 +00:00 |
| ReduceSumProdKernel.cu | [ATen] Vectorize 8 elements on 16 bit data types for sum/mean (#165055) | 2025-10-17 13:39:36 +00:00 |
| reduction_template.cuh | [ATen] Fix CUDA reduction warp shuffle order (#164790) | 2025-10-21 00:09:13 +00:00 |
| ReflectionPad.cu | [CUDA] fix reflection padding for large batch size (#165942) | 2025-10-21 21:07:38 +00:00 |
| RenormKernel.cu | | |
| Repeat.cu | Add CUDA_KERNEL_ASSERT_PRINTF, a more flexible CUDA_KERNEL_ASSERT_MSG (#160129) | 2025-09-16 00:23:48 +00:00 |
| ReplicationPadding.cu | [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) | 2024-12-19 00:55:11 +00:00 |
| Resize.cpp | | |
| Resize.h | Enable modernize-use-default-member-init (#149046) | 2025-04-09 11:57:24 +00:00 |
| RNN.cu | [ROCm] missing AT_CUDA_CHECK for cub and SoftMax (#149883) | 2025-03-25 23:22:32 +00:00 |
| RowwiseScaledMM.cu | [cutlass] Prep for cutlass upgrade by ignoring Wunused-but-set-variable (#159276) | 2025-07-29 04:40:24 +00:00 |
| RowwiseScaledMM.h | | |
| RreluWithNoise.cu | [4/N] Avoid copy in std::get (#142285) | 2024-12-09 07:59:35 +00:00 |
| scaled_modified_bessel_k1.cu | | |
| scaled_modified_bessel_k0.cu | | |
| ScaledBlas.cpp | Revert "Add CUDA MXFP4 scaled mm support via. FBGEMM (#166526)" | 2025-10-31 21:10:28 +00:00 |
| ScaledGroupMM.cu | improve shape checks for grouped_mm (#159666) | 2025-08-02 00:12:25 +00:00 |
| ScaledGroupMM.h | [WIP] Initial implementation of Grouped Gemm API (#148531) | 2025-03-11 21:49:46 +00:00 |
| ScanKernels.cpp | Implement deterministic scan (#140887) | 2024-11-19 23:43:26 +00:00 |
| ScanKernels.h | | |
| ScanUtils.cuh | Remove outdated CUDA 11 conditions (#154313) | 2025-05-28 08:44:58 +00:00 |
| ScatterGatherKernel.cu | Add a compile-time flag to trigger verbose logging for device-side asserts (#166171) | 2025-10-30 19:43:46 +00:00 |
| SegmentReduce.cu | [CD] Add CUDA 13.0 Windows build (#161663) | 2025-09-01 15:27:17 +00:00 |
| Shape.cu | Fix: nDims is mutated inside the loop in Shape.cu (#165446) | 2025-10-15 02:32:15 +00:00 |
| shifted_chebyshev_polynomial_t.cu | | |
| shifted_chebyshev_polynomial_u.cu | | |
| shifted_chebyshev_polynomial_v.cu | | |
| shifted_chebyshev_polynomial_w.cu | | |
| SoftMax.cu | [ROCm] Remove use of warpsize on host-side compilation (#156979) | 2025-07-01 04:55:31 +00:00 |
| Sort.cpp | [ROCm] Fix sort for non-standard bool (#147459) | 2025-03-06 00:23:02 +00:00 |
| Sort.cu | [ROCm] Use IPT=8 for block radix sort (#147657) | 2025-02-26 04:22:16 +00:00 |
| Sort.h | | |
| SortImpl.cu | | |
| Sorting.cpp | Fix race condition and make CUDA kthvalue deterministic (#165762) | 2025-10-25 00:45:57 +00:00 |
| Sorting.cu | Fix race condition and make CUDA kthvalue deterministic (#165762) | 2025-10-25 00:45:57 +00:00 |
| Sorting.h | Modernize C++ code in aten/src/ATen/ (#141424) | 2024-11-24 02:15:19 +00:00 |
| SortingCommon.cuh | Recover non-standard bool test for msort (#139870) | 2024-11-11 02:00:34 +00:00 |
| SortingRadixSelect.cuh | | |
| SortStable.cu | Allow at::native::offset_t to be offset using operator+= (#164570) | 2025-10-15 01:40:54 +00:00 |
| SortStable.h | | |
| SortUtils.cuh | | |
| SparseBinaryOpIntersectionKernel.cu | | |
| SparseMM.cu | | |
| SpectralOps.cpp | Remove unnecessary "static" for definitions in anonymous namespace (#165035) | 2025-10-11 00:04:23 +00:00 |
| SpectralOps.cu | [1/N] Remove inclusion of ATen/core/Array.h (#122064) | 2024-11-18 08:50:28 +00:00 |
| spherical_bessel_j0.cu | | |
| StepKernel.cu | | |
| SummaryOps.cu | Non-deterministic alert in histc_cuda for floating types only (#151701) | 2025-04-24 21:16:46 +00:00 |
| TensorCompare.cpp | | |
| TensorCompare.cu | Add FP8 support for eye (#139974) | 2024-12-24 10:00:23 +00:00 |
| TensorFactories.cu | [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) | 2024-12-19 00:55:11 +00:00 |
| TensorModeKernel.cpp | | |
| TensorModeKernel.cu | [ROCm] Remove use of warpsize on host-side compilation (#156979) | 2025-07-01 04:55:31 +00:00 |
| TensorModeKernel.cuh | Remove outdated CUDA 11 conditions (#154313) | 2025-05-28 08:44:58 +00:00 |
| TensorModeKernel.h | Modernize C++ code in aten/src/ATen/ (#141424) | 2024-11-24 02:15:19 +00:00 |
| TensorShape.cu | Make torch._chunk_cat support non-contiguous inputs (#151263) | 2025-04-16 04:18:46 +00:00 |
| TensorShapeCUDA.cpp | | |
| TensorTopK.cpp | Remove CUDA 11 workarounds for CUB_SUPPORTS_SCAN_BY_KEY and CUB_SUPPORTS_UNIQUE_BY_KEY (#164637) | 2025-10-18 20:05:54 +00:00 |
| TensorTopK.cu | Remove CUDA 11 workarounds for CUB_SUPPORTS_SCAN_BY_KEY and CUB_SUPPORTS_UNIQUE_BY_KEY (#164637) | 2025-10-18 20:05:54 +00:00 |
| TensorTopK.h | Modernize C++ code in aten/src/ATen/ (#141424) | 2024-11-24 02:15:19 +00:00 |
| TensorTransformations.cu | [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) | 2024-12-19 00:55:11 +00:00 |
| thread_constants.h | Revert "[CUDA] Only use vec128 if CUDA version is newer than 12.8 (#150705)" | 2025-04-08 16:29:05 +00:00 |
| TriangularOps.cu | [cuda] fix triu/tril int32 overflow for large matrices (#164705) | 2025-10-20 07:17:41 +00:00 |
| UnaryComplexKernels.cu | | |
| UnaryFractionKernels.cu | | |
| UnaryGammaKernels.cu | | |
| UnaryGeometricAcoshKernel.cu | | |
| UnaryGeometricAcosKernel.cu | | |
| UnaryGeometricAsinhKernel.cu | | |
| UnaryGeometricAsinKernel.cu | | |
| UnaryGeometricAtanhKernel.cu | | |
| UnaryGeometricAtanKernel.cu | | |
| UnaryGeometricCoshKernel.cu | | |
| UnaryGeometricCosKernel.cu | | |
| UnaryGeometricSinhKernel.cu | | |
| UnaryGeometricSinKernel.cu | | |
| UnaryGeometricTanhKernel.cu | disable jiterator for complex tan and tanh (#165250) | 2025-10-29 04:59:01 +00:00 |
| UnaryGeometricTanKernel.cu | disable jiterator for complex tan and tanh (#165250) | 2025-10-29 04:59:01 +00:00 |
| UnaryLogKernels.cu | | |
| UnaryOpsKernel.cu | | |
| UnarySignKernels.cu | | |
| UnarySpecialOpsKernel.cu | | |
| UnfoldBackwardKernel.cu | | |
| Unique.cu | | |
| UniqueCub.cu | [ATen][CUDA][CUB] Implement changes to CCCL (CUB/Thrust/LibCUDACXX) usage in ATen (#153373) | 2025-06-28 05:44:52 +00:00 |
| UniqueCub.cuh | | |
| UpSample.cuh | Turn some const variables into constexpr in C++ code (#165401) | 2025-10-17 13:24:46 +00:00 |
| UpSampleBicubic2d.cu | | |
| UpSampleBilinear2d.cu | [ROCm] new implementation of upsample_bilinear2d_backward (#164572) | 2025-10-25 02:39:24 +00:00 |
| UpSampleLinear1d.cu | | |
| UpSampleNearest1d.cu | | |
| UpSampleNearest2d.cu | [64-bit][CUDA] Upsample2D 64-bit indexing fix attempt 2 (#141923) | 2025-01-04 02:30:38 +00:00 |
| UpSampleNearest3d.cu | [64-bit] Int64 casting for UpSampleNearest3D (#144865) | 2025-01-29 19:30:09 +00:00 |
| UpSampleTrilinear3d.cu | | |
| ValidateCompressedIndicesKernel.cu | | |
| vol2col.cuh | | |
| WeightNorm.cu | | |
| ZetaKernel.cu | | |