pytorch/aten/src/ATen/native/cuda
Latest commit: 2025-10-31 21:10:28 +00:00
cutlass_extensions Clean up of CUTLASS_VERSION (#152947) 2025-05-08 08:32:34 +00:00
linalg [1/N][Fix] Fix typo in aten folder (#166126) 2025-10-27 15:34:39 +00:00
AbsKernel.cu
Activation.cpp c10::string_view -> std::string_view in aten (#141903) 2024-12-07 23:23:52 +00:00
Activation.h Modernize C++ code in aten/src/ATen/ (#141424) 2024-11-24 02:15:19 +00:00
ActivationEluKernel.cu
ActivationGeluKernel.cu
ActivationGluKernel.cu
ActivationHardshrinkKernel.cu
ActivationHardsigmoidKernel.cu [ROCm] fix hardsigmoid op (#162758) 2025-09-12 15:07:13 +00:00
ActivationHardswishKernel.cu Fix torch.nn.functional.hardswish gradients corner case (#148049) 2025-03-14 18:53:10 +00:00
ActivationHardtanhKernel.cu
ActivationLeakyReluKernel.cu
ActivationLogSigmoidKernel.cu
ActivationMishKernel.cu
ActivationPreluKernel.cu
ActivationSiluKernel.cu
ActivationSoftplusKernel.cu
ActivationSoftshrinkKernel.cu softshrink nan fixes (#138421) 2024-11-21 23:06:08 +00:00
ActivationThresholdKernel.cu
AdaptiveAveragePooling.cu [Doc fix] fix spelling of enough (#159587) 2025-08-01 01:50:57 +00:00
AdaptiveAveragePooling3d.cu Fix incorrect stride handling in adaptive_avg_pool3d (#157326) 2025-07-01 03:03:48 +00:00
AdaptiveMaxPooling2d.cu
AdaptiveMaxPooling3d.cu
airy_ai.cu
AmpKernels.cu Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
AveragePool2d.cu [CUDA][avgpool2d] Fix backward launch bounds again for sm100, sm120 (#150640) 2025-04-04 13:05:40 +00:00
AveragePool3d.cu
bessel_j0.cu
bessel_j1.cu
bessel_y0.cu
bessel_y1.cu
BinaryBitwiseOpsKernels.cu
BinaryDivFloorKernel.cu
BinaryDivTrueKernel.cu
BinaryDivTruncKernel.cu
BinaryGeometricKernels.cu
BinaryInternal.h
BinaryLogicalOpsKernels.cu
BinaryMiscBackwardOpsKernels.cu
BinaryMiscOpsKernels.cu
BinaryMulKernel.cu
BinaryRemainderKernel.cu
BinaryShiftOpsKernels.cu
Blas.cpp [CUDA][cuBLASLt] addmm -- extend bias fusions to cases with (1 by n) shapes (#166307) 2025-10-31 14:30:41 +00:00
block_reduce.cuh [ROCm] Remove use of warpsize on host-side compilation (#156979) 2025-07-01 04:55:31 +00:00
Bucketization.cu [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) 2024-12-19 00:55:11 +00:00
chebyshev_polynomial_t.cu
chebyshev_polynomial_u.cu
chebyshev_polynomial_v.cu
chebyshev_polynomial_w.cu
Col2Im.cu
CompareEQKernel.cu
CompareKernels.cu
ComplexKernel.cu
CompositeRandomAccessor.h
ConvolutionMM2d.cu
Copy.cu [PyTorch] Use events from pool in copy_device_to_device (#165647) 2025-10-28 05:19:05 +00:00
Copy.h
CopysignKernel.cu
CrossKernel.cu
cuBlasCommonArgs.h [1/2] Split cublasCommonArgs into its own file (#166313) 2025-10-28 16:35:32 +00:00
CUDAJitLoops.cuh [ATen][CUDA] Implement 128 bit vectorization v2 (#145746) 2025-01-31 06:42:08 +00:00
CUDALoops.cuh Update workaround to old CUDA bug (#164354) (#165984) 2025-10-21 19:09:43 +00:00
CUDAScalar.cu [ROCm] delete un-needed workaround for tensor.item() (#158486) 2025-07-23 00:31:57 +00:00
CuFFTPlanCache.h [Doc fix] fix spelling of enough (#159587) 2025-08-01 01:50:57 +00:00
CuFFTUtils.h [ATen][CUDA][cuFFT] Guard against deprecated error codes (#159466) 2025-07-30 21:10:32 +00:00
CumminmaxKernel.cu
CumprodKernel.cu
CumsumKernel.cu
cutlass_common.cuh [CUTLASS] [CUDA] SM100 GroupMM (#156203) 2025-06-28 23:02:00 +00:00
DepthwiseConv2d.cu Work around buggy use_const_ref_for_mutable_tensors (#145530) 2025-01-24 14:38:49 +00:00
DepthwiseConv3d.cu
DeviceSqrt.cuh
DilatedMaxPool2d.cu Turn some const variables into constexpr in C++ code (#165401) 2025-10-17 13:24:46 +00:00
DilatedMaxPool3d.cu
DistanceKernel.cu [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) 2024-12-19 00:55:11 +00:00
DistributionBernoulli.cu
DistributionCauchyKernel.cu
DistributionExponentialKernel.cu
DistributionGeometricKernel.cu
DistributionLogNormalKernel.cu
DistributionNormal.cu
DistributionRandomKernel.cu
Distributions.cpp
Distributions.cu
Distributions.h
DistributionTemplates.h [1/N][Fix] Fix typo in aten folder (#166126) 2025-10-27 15:34:39 +00:00
DistributionUniform.cu
Dropout.cu [ATen][CUDA] Implement 128 bit vectorization v2 (#145746) 2025-01-31 06:42:08 +00:00
Embedding.cu Remove CUDA 11 workarounds for CUB_SUPPORTS_SCAN_BY_KEY and CUB_SUPPORTS_UNIQUE_BY_KEY (#164637) 2025-10-18 20:05:54 +00:00
EmbeddingBackwardKernel.cu Remove CUDA 11 workarounds for CUB_SUPPORTS_SCAN_BY_KEY and CUB_SUPPORTS_UNIQUE_BY_KEY (#164637) 2025-10-18 20:05:54 +00:00
EmbeddingBackwardKernel.cuh
EmbeddingBag.cu Remove CUDA 11 workarounds for CUB_SUPPORTS_SCAN_BY_KEY and CUB_SUPPORTS_UNIQUE_BY_KEY (#164637) 2025-10-18 20:05:54 +00:00
Equal.cpp
FillKernel.cu
FlattenIndicesKernel.cu
ForeachBinaryOpList.cu [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) 2025-09-21 05:24:13 +00:00
ForeachBinaryOpScalar.cu [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) 2025-09-21 05:24:13 +00:00
ForeachBinaryOpScalarList.cu [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) 2025-09-21 05:24:13 +00:00
ForeachBinaryOpScalarTensor.cu [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) 2025-09-21 05:24:13 +00:00
ForeachFunctors.cuh Revert "handling special case for pow(3) for GPU (#157537)" 2025-08-19 22:57:45 +00:00
ForeachMinMaxFunctors.cuh
ForeachPointwiseOp.cu [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) 2025-09-21 05:24:13 +00:00
ForeachReduceOp.cu chunk_size should always be int64_t for Foreach functors (#156872) 2025-06-27 22:35:34 +00:00
ForeachTernaryOp.cu [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) 2025-09-21 05:24:13 +00:00
ForeachUnaryOp.cu [BE][Ez]: Prevent copies of std::vector in CUDA ForeachOps (#163416) 2025-09-21 05:24:13 +00:00
FractionalMaxPool2d.cu
FractionalMaxPool3d.cu [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) 2024-12-19 00:55:11 +00:00
FunctionOfAMatrixUtilsKernel.cu
fused_adagrad_impl.cu Split out C++ code from fused adagrad PR (#159008) 2025-07-26 00:36:59 +00:00
fused_adagrad_impl.cuh Split out C++ code from fused adagrad PR (#159008) 2025-07-26 00:36:59 +00:00
fused_adagrad_utils.cuh [BugFix] chunk_size should always be int64_t (#165971) 2025-10-21 19:52:47 +00:00
fused_adam_amsgrad_impl.cu
fused_adam_amsgrad_impl.cuh
fused_adam_impl.cu
fused_adam_impl.cuh
fused_adam_utils.cuh chunk_size should always be int64_t for Foreach functors (#156872) 2025-06-27 22:35:34 +00:00
fused_adamw_amsgrad_impl.cu
fused_adamw_amsgrad_impl.cuh
fused_adamw_impl.cu
fused_adamw_impl.cuh
FusedAdagradKernel.cu Split out C++ code from fused adagrad PR (#159008) 2025-07-26 00:36:59 +00:00
FusedAdamKernel.cu [5/N] Apply bugprone-unchecked-optional-access (#143111) 2024-12-15 01:07:28 +00:00
FusedAdamWKernel.cu [5/N] Apply bugprone-unchecked-optional-access (#143111) 2024-12-15 01:07:28 +00:00
FusedSgdKernel.cu chunk_size should always be int64_t for Foreach functors (#156872) 2025-06-27 22:35:34 +00:00
GcdLcmKernel.cu
GridSampler.cpp
GridSampler.cu
GridSampler.cuh
GridSampler.h Modernize C++ code in aten/src/ATen/ (#141424) 2024-11-24 02:15:19 +00:00
group_norm_kernel.cu [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) 2024-12-19 00:55:11 +00:00
GroupedBlas.cpp Add MXFP4 grouped gemm support via. FBGEMM kernels (#166530) 2025-10-30 16:46:11 +00:00
GroupMM.cu improve shape checks for grouped_mm (#159666) 2025-08-02 00:12:25 +00:00
GroupMM.h bf16 grouped gemm (#150374) 2025-04-06 04:53:24 +00:00
GroupMMCommon.cuh improve shape checks for grouped_mm (#159666) 2025-08-02 00:12:25 +00:00
hermite_polynomial_h.cu
hermite_polynomial_he.cu
IGammaKernel.cu Turn some const variables into constexpr in C++ code (#165401) 2025-10-17 13:24:46 +00:00
Im2Col.cu
im2col.cuh [BE] Remove unusued channels arg in col2im (#142336) 2024-12-09 01:49:41 +00:00
Indexing.cu [ROCm] Adjust grid size for non-unit stride backwards indexing (#165026) 2025-10-10 16:36:38 +00:00
IndexKernel.cpp [4/N] Avoid copy in std::get (#142285) 2024-12-09 07:59:35 +00:00
IndexKernel.cu [CUDA] fix indexing on large tensor causing invalid configuration argument (#164049) 2025-09-29 06:07:35 +00:00
IndexKernel.h Modernize C++ code in aten/src/ATen/ (#141424) 2024-11-24 02:15:19 +00:00
IndexKernelUtils.cu Add a compile-time flag to trigger verbose logging for device-side asserts (#166171) 2025-10-30 19:43:46 +00:00
IndexKernelUtils.h Support more dtypes for input, indices in gather (#151822) 2025-05-01 16:35:23 +00:00
int4mm.cu Remove old ROCm version checks and branches (#166111) 2025-10-27 05:32:54 +00:00
int8mm.cu [WOQ] Integrate CUDA support for int8pack_mm woq optimization pattern (#161680) 2025-09-17 10:24:13 +00:00
jit_utils.cpp [1/N][Fix] Fix typo in aten folder (#166126) 2025-10-27 15:34:39 +00:00
jit_utils.h add the torch.float8_e8m0fnu dtype to PyTorch (#147466) 2025-02-20 13:55:42 +00:00
JitLoops.cuh
KernelUtils.cuh Remove old ROCm version checks and branches (#166111) 2025-10-27 05:32:54 +00:00
laguerre_polynomial_l.cu
LaunchUtils.h
layer_norm_kernel.cu [ROCm] Disable __builtin_amdgcn_rcpf for gfx90a (#166454) 2025-10-30 23:39:00 +00:00
legendre_polynomial_p.cu
Lerp.cu Fix torch.lerp RuntimeError when weight is CPU scalar while input & end are CUDA tensor (#141820) 2024-12-09 18:14:54 +00:00
LinearAlgebra.cu
LinearAlgebraStubs.cpp [1/N][Fix] Fix typo in aten folder (#166126) 2025-10-27 15:34:39 +00:00
LogAddExpKernel.cu
LogcumsumexpKernel.cu Remove old workaround in launch_logcumsumexp_cuda_kernel (#164567) 2025-10-03 18:07:02 +00:00
Loops.cuh Simplify c10::guts::apply (#164566) 2025-10-22 00:47:43 +00:00
Loss.cu Removed ROCM ifdef that governs thread count + smem parallel reduction. (#149779) 2025-03-29 04:27:54 +00:00
LossCTC.cu [CUDA] Decrease launch bounds of CTCLoss backward for blackwell (#159522) 2025-08-05 19:26:25 +00:00
Math.cuh Turn some const variables into constexpr in C++ code (#165401) 2025-10-17 13:24:46 +00:00
MaxMinElementwiseKernel.cu
MaxUnpooling.cu [BUG] MaxUnpool2d/3d should check output dim before accessing its elements (#163507) 2025-09-22 21:36:48 +00:00
MemoryAccess.cuh [ROCm] Improve vectorized elementwise kernel performance in MI300X (#153634) 2025-05-27 20:49:32 +00:00
MiscUtils.h Enable modernize-use-default-member-init (#149046) 2025-04-09 11:57:24 +00:00
MixedDtypesLinear.cu Remove outdated CUDA 11 conditions (#154313) 2025-05-28 08:44:58 +00:00
modified_bessel_i0.cu
modified_bessel_i1.cu
modified_bessel_k0.cu
modified_bessel_k1.cu
MultiLabelMarginCriterion.cu
MultiMarginLoss.cu [CUDA] Fix missing __syncthreads in MultiMarginLoss backward (#158994) 2025-07-24 20:47:29 +00:00
MultinomialKernel.cu [ROCm] Remove use of warpsize on host-side compilation (#156979) 2025-07-01 04:55:31 +00:00
MultiTensorApply.cuh
NaiveConvolutionTranspose2d.cu
NaiveConvolutionTranspose3d.cu
NaiveDilatedConvolution.cu
NLLLoss2d.cu [cuda] fix nll_loss2d backward bounds check with reduction=none (#165247) 2025-10-20 06:25:11 +00:00
Nonzero.cu Remove C++ and test branches for CUDA<12 (#163443) 2025-09-22 18:20:08 +00:00
Normalization.cu Add assertion to align with cuda (#153233) 2025-05-23 07:32:43 +00:00
Normalization.cuh Use std::min for #166021 (#166195) 2025-10-27 17:57:44 +00:00
PersistentSoftmax.cuh Improve softmax's perf in cuda (#144679) 2025-01-23 00:02:57 +00:00
PointwiseOpsKernel.cu Remove outdated CUDA 11 conditions (#154313) 2025-05-28 08:44:58 +00:00
Pow.cuh Workaround ATen SFINAE under libc++ (#161101) 2025-08-21 00:55:58 +00:00
PowKernel.cu Revert "handling special case for pow(3) for GPU (#157537)" 2025-08-19 22:57:45 +00:00
Randperm.cu
Randperm.cuh [4/N] Avoid copy in std::get (#142285) 2024-12-09 07:59:35 +00:00
RangeFactories.cu [CUDA][MPS] Fix torch.arange bound validation for large float inputs (#154320) 2025-06-05 14:51:25 +00:00
RecordStream.cu
Reduce.cu
Reduce.cuh [ATen] Fix CUDA reduction warp shuffle order (#164790) 2025-10-21 00:09:13 +00:00
ReduceAMinMaxKernel.cu
ReduceArgMaxKernel.cu
ReduceArgMinKernel.cu
ReduceLogicKernel.cu
ReduceMaxValuesKernel.cu
ReduceMinValuesKernel.cu
ReduceMomentKernel.cu [ATen] Vectorize 8 elements on 16 bit data types for sum/mean (#165055) 2025-10-17 13:39:36 +00:00
ReduceNormKernel.cu
ReduceOps.cpp
ReduceOps.h Modernize C++ code in aten/src/ATen/ (#141424) 2024-11-24 02:15:19 +00:00
ReduceSumProdKernel.cu [ATen] Vectorize 8 elements on 16 bit data types for sum/mean (#165055) 2025-10-17 13:39:36 +00:00
reduction_template.cuh [ATen] Fix CUDA reduction warp shuffle order (#164790) 2025-10-21 00:09:13 +00:00
ReflectionPad.cu [CUDA] fix reflection padding for large batch size (#165942) 2025-10-21 21:07:38 +00:00
RenormKernel.cu
Repeat.cu Add CUDA_KERNEL_ASSERT_PRINTF, a more flexible CUDA_KERNEL_ASSERT_MSG (#160129) 2025-09-16 00:23:48 +00:00
ReplicationPadding.cu [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) 2024-12-19 00:55:11 +00:00
Resize.cpp
Resize.h Enable modernize-use-default-member-init (#149046) 2025-04-09 11:57:24 +00:00
RNN.cu [ROCm] missing AT_CUDA_CHECK for cub and SoftMax (#149883) 2025-03-25 23:22:32 +00:00
RowwiseScaledMM.cu [cutlass] Prep for cutlass upgrade by ignoring Wunused-but-set-variable (#159276) 2025-07-29 04:40:24 +00:00
RowwiseScaledMM.h
RreluWithNoise.cu [4/N] Avoid copy in std::get (#142285) 2024-12-09 07:59:35 +00:00
scaled_modified_bessel_k0.cu
scaled_modified_bessel_k1.cu
ScaledBlas.cpp Revert "Add CUDA MXFP4 scaled mm support via. FBGEMM (#166526)" 2025-10-31 21:10:28 +00:00
ScaledGroupMM.cu improve shape checks for grouped_mm (#159666) 2025-08-02 00:12:25 +00:00
ScaledGroupMM.h [WIP] Initial implementation of Grouped Gemm API (#148531) 2025-03-11 21:49:46 +00:00
ScanKernels.cpp Implement deterministic scan (#140887) 2024-11-19 23:43:26 +00:00
ScanKernels.h
ScanUtils.cuh Remove outdated CUDA 11 conditions (#154313) 2025-05-28 08:44:58 +00:00
ScatterGatherKernel.cu Add a compile-time flag to trigger verbose logging for device-side asserts (#166171) 2025-10-30 19:43:46 +00:00
SegmentReduce.cu [CD] Add CUDA 13.0 Windows build (#161663) 2025-09-01 15:27:17 +00:00
Shape.cu Fix: nDims is mutated inside the loop in Shape.cu (#165446) 2025-10-15 02:32:15 +00:00
shifted_chebyshev_polynomial_t.cu
shifted_chebyshev_polynomial_u.cu
shifted_chebyshev_polynomial_v.cu
shifted_chebyshev_polynomial_w.cu
SoftMax.cu [ROCm] Remove use of warpsize on host-side compilation (#156979) 2025-07-01 04:55:31 +00:00
Sort.cpp [ROCm] Fix sort for non-standard bool (#147459) 2025-03-06 00:23:02 +00:00
Sort.cu [ROCm] Use IPT=8 for block radix sort (#147657) 2025-02-26 04:22:16 +00:00
Sort.h
SortImpl.cu
Sorting.cpp Fix race condition and make CUDA kthvalue deterministic (#165762) 2025-10-25 00:45:57 +00:00
Sorting.cu Fix race condition and make CUDA kthvalue deterministic (#165762) 2025-10-25 00:45:57 +00:00
Sorting.h Modernize C++ code in aten/src/ATen/ (#141424) 2024-11-24 02:15:19 +00:00
SortingCommon.cuh Recover non-standard bool test for msort (#139870) 2024-11-11 02:00:34 +00:00
SortingRadixSelect.cuh
SortStable.cu Allow at::native::offset_t to be offset using operator+= (#164570) 2025-10-15 01:40:54 +00:00
SortStable.h
SortUtils.cuh
SparseBinaryOpIntersectionKernel.cu
SparseMM.cu
SpectralOps.cpp Remove unnecessary "static" for definitions in anonymous namespace (#165035) 2025-10-11 00:04:23 +00:00
SpectralOps.cu [1/N] Remove inclusion of ATen/core/Array.h (#122064) 2024-11-18 08:50:28 +00:00
spherical_bessel_j0.cu
StepKernel.cu
SummaryOps.cu Non-deterministic alert in histc_cuda for floating types only (#151701) 2025-04-24 21:16:46 +00:00
TensorCompare.cpp
TensorCompare.cu Add FP8 support for eye (#139974) 2024-12-24 10:00:23 +00:00
TensorFactories.cu [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) 2024-12-19 00:55:11 +00:00
TensorModeKernel.cpp
TensorModeKernel.cu [ROCm] Remove use of warpsize on host-side compilation (#156979) 2025-07-01 04:55:31 +00:00
TensorModeKernel.cuh Remove outdated CUDA 11 conditions (#154313) 2025-05-28 08:44:58 +00:00
TensorModeKernel.h Modernize C++ code in aten/src/ATen/ (#141424) 2024-11-24 02:15:19 +00:00
TensorShape.cu Make torch._chunk_cat support non-contiguous inputs (#151263) 2025-04-16 04:18:46 +00:00
TensorShapeCUDA.cpp
TensorTopK.cpp Remove CUDA 11 workarounds for CUB_SUPPORTS_SCAN_BY_KEY and CUB_SUPPORTS_UNIQUE_BY_KEY (#164637) 2025-10-18 20:05:54 +00:00
TensorTopK.cu Remove CUDA 11 workarounds for CUB_SUPPORTS_SCAN_BY_KEY and CUB_SUPPORTS_UNIQUE_BY_KEY (#164637) 2025-10-18 20:05:54 +00:00
TensorTopK.h Modernize C++ code in aten/src/ATen/ (#141424) 2024-11-24 02:15:19 +00:00
TensorTransformations.cu [CUDA][64-bit indexing] Fix some existing problematic int64_t _ = blockIdx.* * blockDim.* code (#142010) 2024-12-19 00:55:11 +00:00
thread_constants.h Revert "[CUDA] Only use vec128 if CUDA version is newer than 12.8 (#150705)" 2025-04-08 16:29:05 +00:00
TriangularOps.cu [cuda] fix triu/tril int32 overflow for large matrices (#164705) 2025-10-20 07:17:41 +00:00
UnaryComplexKernels.cu
UnaryFractionKernels.cu
UnaryGammaKernels.cu
UnaryGeometricAcoshKernel.cu
UnaryGeometricAcosKernel.cu
UnaryGeometricAsinhKernel.cu
UnaryGeometricAsinKernel.cu
UnaryGeometricAtanhKernel.cu
UnaryGeometricAtanKernel.cu
UnaryGeometricCoshKernel.cu
UnaryGeometricCosKernel.cu
UnaryGeometricSinhKernel.cu
UnaryGeometricSinKernel.cu
UnaryGeometricTanhKernel.cu disable jiterator for complex tan and tanh (#165250) 2025-10-29 04:59:01 +00:00
UnaryGeometricTanKernel.cu disable jiterator for complex tan and tanh (#165250) 2025-10-29 04:59:01 +00:00
UnaryLogKernels.cu
UnaryOpsKernel.cu
UnarySignKernels.cu
UnarySpecialOpsKernel.cu
UnfoldBackwardKernel.cu
Unique.cu
UniqueCub.cu [ATen][CUDA][CUB] Implement changes to CCCL (CUB/Thrust/LibCUDACXX) usage in ATen (#153373) 2025-06-28 05:44:52 +00:00
UniqueCub.cuh
UpSample.cuh Turn some const variables into constexpr in C++ code (#165401) 2025-10-17 13:24:46 +00:00
UpSampleBicubic2d.cu
UpSampleBilinear2d.cu [ROCm] new implementation of upsample_bilinear2d_backward (#164572) 2025-10-25 02:39:24 +00:00
UpSampleLinear1d.cu
UpSampleNearest1d.cu
UpSampleNearest2d.cu [64-bit][CUDA] Upsample2D 64-bit indexing fix attempt 2 (#141923) 2025-01-04 02:30:38 +00:00
UpSampleNearest3d.cu [64-bit] Int64 casting for UpSampleNearest3D (#144865) 2025-01-29 19:30:09 +00:00
UpSampleTrilinear3d.cu
ValidateCompressedIndicesKernel.cu
vol2col.cuh
WeightNorm.cu
ZetaKernel.cu