mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-07 12:21:27 +01:00
Scenario:
```
>>> nodes
IterationRangesEntry(
x2,
divisor=192*u0 + 192576,
length=s1,
(xindex//(192*u0 + 192576)),
{x0: 192, x1: u0 + 1003, x2: s1, x3: 192*s1*u0 + 192576*s1, x4: 192*u0 + 192576})
IterationRangesEntry(
x1,
divisor=192,
length=u0 + 1003,
ModularIndexing(xindex, 192, u0 + 1003),
{x0: 192, x1: u0 + 1003, x2: s1, x3: 192*s1*u0 + 192576*s1, x4: 192*u0 + 192576})
IterationRangesEntry(
x0,
divisor=1,
length=192,
ModularIndexing(xindex, 1, 192),
{x0: 192, x1: u0 + 1003, x2: s1, x3: 192*s1*u0 + 192576*s1, x4: 192*u0 + 192576})
```
Consider whether using the fallback is safe here. I think it is, because the divisor of each IterationRangesEntry should be the product of the lengths of the IterationRangesEntry objects that precede it (here, 192 * (u0 + 1003) = 192*u0 + 192576 for x2). Unless one of the lengths divides by an unbacked symint?
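The divisor/length relationship described above can be sanity-checked with a small sketch. It is not Inductor code: it substitutes concrete integers for the symbols `u0` and `s1`, and the helper names (`lengths_for`, `divisors_for`, `running_products`, `decompose`, `recompose`) are hypothetical, introduced only for illustration. It assumes `ModularIndexing(x, d, m)` means `(x // d) % m`.

```python
from itertools import accumulate
from operator import mul


def lengths_for(u0, s1):
    # Lengths of x0, x1, x2 from the dump above, innermost first.
    return [192, u0 + 1003, s1]


def divisors_for(u0):
    # Divisors of x0, x1, x2 from the dump above, innermost first.
    return [1, 192, 192 * u0 + 192576]


def running_products(lengths):
    # Expected divisor of entry i: product of the lengths of entries 0..i-1.
    return list(accumulate(lengths[:-1], mul, initial=1))


def decompose(xindex, lengths, divisors):
    # (xindex // divisor) % length stands in for ModularIndexing.
    # For the outermost entry the modulo is a no-op while xindex is in range.
    return [(xindex // d) % n for d, n in zip(divisors, lengths)]


def recompose(coords, divisors):
    # Inverse of decompose: weight each coordinate by its divisor.
    return sum(c * d for c, d in zip(coords, divisors))


# Arbitrary stand-in values for the symbolic sizes (assumptions, not from the PR).
for u0, s1 in [(0, 1), (5, 3), (17, 4)]:
    lengths = lengths_for(u0, s1)
    divisors = divisors_for(u0)
    # Each divisor equals the product of the preceding lengths.
    assert divisors == running_products(lengths)
    total = divisors[-1] * lengths[-1]
    for xindex in (0, 191, 192, total - 1):
        coords = decompose(xindex, lengths, divisors)
        assert recompose(coords, divisors) == xindex
```

The check only covers integer substitutions, so it does not address the open question about a length that itself divides by an unbacked symint.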
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130595
Approved by: https://github.com/aakhundov, https://github.com/ezyang
| Name |
|---|
| aoti_runtime |
| cuda |
| rocm |
| xpu |
| __init__.py |
| aoti_hipify_utils.py |
| codegen_device_driver.py |
| common.py |
| cpp_gemm_template.py |
| cpp_micro_gemm.py |
| cpp_prefix.h |
| cpp_template_kernel.py |
| cpp_template.py |
| cpp_utils.py |
| cpp_wrapper_cpu.py |
| cpp_wrapper_cuda.py |
| cpp.py |
| cuda_combined_scheduling.py |
| halide.py |
| memory_planning.py |
| multi_kernel.py |
| simd.py |
| triton_foreach.py |
| triton_split_scan.py |
| triton_utils.py |
| triton.py |
| wrapper.py |