Commit Graph

614 Commits

Author SHA1 Message Date
Maggie Moss
c7eee49525 Fix pyrefly ignores 1/n (#166239)
First diff adjusting the syntax for pyrefly: ignore suppressions so they only hide one class of type error.

Test:
lintrunner
pyrefly check

Pull Request resolved: https://github.com/pytorch/pytorch/pull/166239
Approved by: https://github.com/oulgen
2025-10-26 00:44:10 +00:00
Yuanyuan Chen
fdab48a7c1 Enable all PIE rules on ruff (#165814)
This PR enables all PIE rules on ruff, there are already some enabled rules from this family, the new added rules are
```
PIE796  Enum contains duplicate value: {value}
PIE808  Unnecessary start argument in range
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165814
Approved by: https://github.com/ezyang
2025-10-18 07:36:18 +00:00
PyTorch MergeBot
24520b8386 Revert "Enable all PIE rules on ruff (#165814)"
This reverts commit c79dfdc655.

Reverted https://github.com/pytorch/pytorch/pull/165814 on behalf of https://github.com/cyyever due to Need to cover more files ([comment](https://github.com/pytorch/pytorch/pull/165814#issuecomment-3417931863))
2025-10-18 07:21:08 +00:00
Yuanyuan Chen
c79dfdc655 Enable all PIE rules on ruff (#165814)
This PR enables all PIE rules on ruff, there are already some enabled rules from this family, the new added rules are
```
PIE796  Enum contains duplicate value: {value}
PIE808  Unnecessary start argument in range
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165814
Approved by: https://github.com/ezyang
2025-10-18 06:40:12 +00:00
Yuanyuan Chen
fbe0d20a17 [2/N] More ruff SIM fixes (#165031)
This is follow-up of #164695 to apply ruff SIM rules to more files. Most changes are about simplifying dict.get because None is already the default value.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165031
Approved by: https://github.com/mlazos
2025-10-14 14:22:54 +00:00
can-gaa-hou
39161e73fc [Fix] missing lambda in torch._check (#165043)
Fixes more missing lambda in torch._check in the source code. Inspired by #164225.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165043
Approved by: https://github.com/FFFrog, https://github.com/Skylion007
2025-10-10 17:11:55 +00:00
PyTorch MergeBot
b8be796a57 Revert "[2/N] More ruff SIM fixes (#165031)"
This reverts commit 38095fbd13.

Reverted https://github.com/pytorch/pytorch/pull/165031 on behalf of https://github.com/albanD due to One of the changed line started to fail on trunk ([comment](https://github.com/pytorch/pytorch/pull/165031#issuecomment-3390190870))
2025-10-10 13:42:14 +00:00
Yuanyuan Chen
38095fbd13 [2/N] More ruff SIM fixes (#165031)
This is follow-up of #164695 to apply ruff SIM rules to more files. Most changes are about simplifying dict.get because None is already the default value.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165031
Approved by: https://github.com/mlazos
2025-10-10 05:37:46 +00:00
Laith Sakka
7158aa22e8 remove more (#164753)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164753
Approved by: https://github.com/aorenste, https://github.com/mlazos
ghstack dependencies: #164664, #164665, #164667, #164668
2025-10-08 14:23:38 +00:00
Colin Peppler
2855a045b3 Use sym_eq and sym_and on symbolic shapes in common_meta_baddbmm_bmm (#164781)
Differential Revision: D84005053

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164781
Approved by: https://github.com/Skylion007
2025-10-07 18:25:00 +00:00
PyTorch MergeBot
5d7360bb03 Revert "Enable all SIM rules except disabled ones (#164645)"
This reverts commit 321e602692.

Reverted https://github.com/pytorch/pytorch/pull/164645 on behalf of https://github.com/izaitsevfb due to causes lint failures ([comment](https://github.com/pytorch/pytorch/pull/164645#issuecomment-3369274351))
2025-10-05 19:32:21 +00:00
Yuanyuan Chen
321e602692 Enable all SIM rules except disabled ones (#164645)
`SIM` rules are useful for simplifying boolean expressions and enhances code readability.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164645
Approved by: https://github.com/ezyang
2025-10-05 07:38:25 +00:00
Maggie Moss
1051c1de5c Add pyrefly suppressions 2/n (#164513)
Adds suppressions to pyrefly will typecheck clean: https://github.com/pytorch/pytorch/issues/163283

Test plan:
dmypy restart && python3 scripts/lintrunner.py -a
pyrefly check

---
step 1: uncomment lines in the `pyrefly.toml` file
before: https://gist.github.com/maggiemoss/911b4d0bc88bf8cf3ab91f67184e9d46

after:
```
 INFO Checking project configured at `/Users/maggiemoss/python_projects/pytorch/pyrefly.toml`
 INFO 0 errors (1,152 ignored)
 ```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164513
Approved by: https://github.com/oulgen
2025-10-03 02:46:13 +00:00
Yuanyuan Chen
315ffdc1e4 [4/N] Apply ruff UP035 rule to python code (#164206)
Follows #164104

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164206
Approved by: https://github.com/albanD
2025-10-01 19:05:53 +00:00
Yuanyuan Chen
cc8b14d09a [2/N] Simplify "in" operation for containers of a single item (#164323)
These issues are detected by ruff [FURB171](https://docs.astral.sh/ruff/rules/single-item-membership-test/#single-item-membership-test-furb171).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164323
Approved by: https://github.com/justinchuby, https://github.com/Skylion007
2025-10-01 05:39:11 +00:00
ankushwahaRH
7f29c47a4f Fix cdist export compute mode validation (#161724)
Fixes #161089. Added '0' as the acceptable value for compute mode in _meta_registrations.py. Also, added a test case in test_export.py file.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161724
Approved by: https://github.com/albanD, https://github.com/angelayi
2025-09-30 12:23:20 +00:00
Yavuz Yetim
7afcb030d8 Back out "Revert D81959389" (#163905)
Summary:
Original commit changeset: 06888d7ebff0

Original Phabricator Diff: D82932788

Restricted the test to SM90 for scaled_grouped_mm

Test Plan: TBD (will share the linux CI results)

Differential Revision: D83283991

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163905
Approved by: https://github.com/angelayi
2025-09-30 07:05:13 +00:00
can-gaa-hou
eb4361a801 [Fix] Adding missing f prefixes to formatted strings [1/N] (#164065)
As stated in the title.

* #164068
* #164067
* #164066
* __->__ #164065

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164065
Approved by: https://github.com/Skylion007
2025-09-29 04:53:00 +00:00
thenumberouscode
c106ee8515 [FakeTensor] Supplement the relevant logic for converting conv1d to conv2d in meta_conv (#160408)
## Fixes https://github.com/pytorch/pytorch/issues/159462 also fixes #163569 , #163604

## summary
the issue is caused by the wrong stride of conv1d's result generated by meta_conv:
4d5b3f2d5a/torch/_meta_registrations.py (L2453-L2471)

and the wrong stride will be used to codegen size assert in inductor:
4d5b3f2d5a/torch/_inductor/ir.py (L6152-L6163)

## reason
So why the computed stride is wrong in the meta_conv function? because the corresponding backend will convert conv1d to conv2d and change the input tensor' size and memory_format(channel last). but the meta_conv do not do this transformation, so a mismatch happend.
4d5b3f2d5a/aten/src/ATen/native/Convolution.cpp (L1502-L1510)
 just add corresponding logic in meta_conv.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160408
Approved by: https://github.com/eellison, https://github.com/jansel, https://github.com/mlazos
2025-09-26 15:45:02 +00:00
Yidi Wu
21a41edd4f Add fake_impl for _native_multi_head_attention (#163700)
Test Plan: See added test in test_export.py

Differential Revision: D83099187

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163700
Approved by: https://github.com/angelayi
2025-09-25 19:01:27 +00:00
Jason Ansel
9c4d9f940b [inductor] Support out_dtype arg to matmul (#163393)
Fixes #163275

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163393
Approved by: https://github.com/eellison, https://github.com/coconutruben
ghstack dependencies: #163386, #163398, #163387, #163414, #163415, #163419, #163434
2025-09-23 15:37:38 +00:00
PyTorch MergeBot
aff76c046d Revert "Add fake_impl for _native_multi_head_attention (#163167)"
This reverts commit 27164b6788.

Reverted https://github.com/pytorch/pytorch/pull/163167 on behalf of https://github.com/malfet due to This broke in inductor-cpu-test, see 1a42656d6c/1 ([comment](https://github.com/pytorch/pytorch/pull/163167#issuecomment-3324302026))
2025-09-23 14:36:45 +00:00
Yidi Wu
27164b6788 Add fake_impl for _native_multi_head_attention (#163167)
Test Plan:
See added test in test_export.py

Rollback Plan:

Reviewed By: henryoier

Differential Revision: D77747446

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163167
Approved by: https://github.com/angelayi
2025-09-23 04:02:20 +00:00
Aart J.C. Bik
9b5ec0ff7c Use computed buffer sizes of torch for cusparseLt metadata (#163125)
Making sure buffer allocation matches what is computed by cusparseLt compression

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163125
Approved by: https://github.com/jcaip
2025-09-19 22:12:40 +00:00
Eddie Yan
9b7a8c4d05 [cuDNN][SDPA][submodule] Roll-back cuDNN frontend upgrade, update Meta registration (#163104)
For https://github.com/pytorch/torchtitan/issues/1713

Also note that we will need to rollback the cuDNN frontend upgrade in 2.9 as it currently introduces a segmentation fault by assuming tensors have their strides and sizes populated at graph creation time 1a7b4b78db/include/cudnn_frontend/node/sdpa_support_surface.h (L447%C2%A0)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163104
Approved by: https://github.com/drisspg
2025-09-17 15:48:54 +00:00
Daniel Vega-Myhre
872ed60679 [mxfp8 torch._scaled_grouped_mm] fix meta registration for 3d tensor (#162765)
Meta registration checks for torch._scaled_grouped_mm has a bug for 3d "B" tensors. Namely, the scale shape for such a tensor should be 2d with shape (G, blocked_K * blocked_N), but it currently enforces an expected 3d shape of (G, blocked_K, blocked_N).

See Blas.cpp for correct validation logic [here](8e217a9f6d/aten/src/ATen/native/cuda/Blas.cpp (L1622)).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162765
Approved by: https://github.com/ngimel
2025-09-12 03:51:52 +00:00
Pian Pawakapan
ac72f81c12 [dynamic shapes] unbacked-safe should_swap (#160473)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160473
Approved by: https://github.com/laithsakka
2025-09-11 18:51:25 +00:00
Daniel Vega-Myhre
b6d0a9ea90 MXFP8 grouped GEMM support for torch._scaled_grouped_mm + submodule bump (#162209)
## Summary
- We just landed 2d-2d support for mxfp8 grouped gemm in FBGEMM: https://github.com/pytorch/FBGEMM/pull/4816
- This is needed for backward pass of mxfp8 MoE training with grouped gemms
- Changes:
    - Add dispatching + input validation for mxfp8 grouped gemm in `torch._scaled_grouped_mm`
    - Add meta registration input validation for mxfp8 grouped gemm, for composability with compile
    - Add unit tests exercising torch._scaled_grouped_mm with mxfp8 inputs
    - Bump FBGEMM third party submodule to include:
          - https://github.com/pytorch/FBGEMM/pull/4816
          - https://github.com/pytorch/FBGEMM/pull/4820
          - https://github.com/pytorch/FBGEMM/pull/4821
          - https://github.com/pytorch/FBGEMM/pull/4823

#### How fbgemm dependency was bumped
Documenting this since I haven't found it documented elsewhere:
- `cd ~/pytorch/third_party/fbgemm`
- `git fetch`
- `git checkout <hash>`
- `cd ~/pytorch`
- `git add third_party/fbgemm`

## Test plan

#### Test build
```
USE_FBGEMM_GENAI=1 python -m pip install --no-build-isolation -v -e .
...
Successfully installed torch-2.9.0a0+gitf5070f3
```
[full build log](https://www.internalfb.com/phabricator/paste/view/P1933787581)

#### Unit tests
```
pytest test/test_matmul_cuda.py -k test_mxfp8_scaled_grouped_mm_
...

test/test_matmul_cuda.py .........                                                                                                                        [100%]

============================================================== 9 passed, 1668 deselected in 5.34s ===============================================================
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162209
Approved by: https://github.com/ngimel
2025-09-06 15:25:30 +00:00
Laith Sakka
fbf3d2027d use sym_or instead of any to avoid dde in calc_conv_nd_return_shape (#162084)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162084
Approved by: https://github.com/aorenste

Co-authored-by: Aaron Orenstein <aorenste@fb.com>
2025-09-04 01:20:22 +00:00
angelayi
e34b6a0103 Add meta for add.Scalar (#161332)
Fixes https://github.com/pytorch/pytorch/issues/161076

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161332
Approved by: https://github.com/Skylion007
2025-08-26 02:26:51 +00:00
Isuru Fernando
e631557518 Fix meta function for aten.complex (#160894)
Closes https://github.com/pytorch/pytorch/issues/160882

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160894
Approved by: https://github.com/mlazos
2025-08-20 16:30:04 +00:00
Isuru Fernando
781e9a7724 Fix meta for constant_pad_nd (#159878)
Fixes https://github.com/pytorch/pytorch/issues/144187

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159878
Approved by: https://github.com/Skylion007, https://github.com/ezyang
2025-08-14 14:47:47 +00:00
angelayi
74a754aae9 Add meta kernel for sdpa_math_for_mps (#159695)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159695
Approved by: https://github.com/malfet
ghstack dependencies: #159456
2025-08-05 22:27:06 +00:00
Shangdi Yu
bc4b04e058 DeviceCopy should have the same layout as input (#159615)
Summary: Fix https://github.com/pytorch/pytorch/issues/159612

- Fix the meta implementation of `nan_to_num`, it should preserve the stride of the input
- The DeviceCopy IR node should always preserve the input's layout, so we don't end up with a contiguous call during device copy

Test Plan:
```
buck2 run @mode/dev-nosan fbcode//caffe2/test/inductor:test_aot_inductor -- -r test_d2h_copy
```

Rollback Plan:

Differential Revision: D79411407

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159615
Approved by: https://github.com/eellison
2025-08-04 23:56:58 +00:00
Natalia Gimelshein
a81ffbc5f5 improve shape checks for grouped_mm (#159666)
Check that contraction dimension matches between tensors if it's known, and do device-side checks for correct offsets
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159666
Approved by: https://github.com/danielvegamyhre, https://github.com/eqy
2025-08-02 00:12:25 +00:00
Chris Thi
c400c8e2e0 [ROCm] Add FP8 rowwise support to _scaled_grouped_mm + Submodule update (#159075)
Summary:

In this PR we integrate the [FBGEMM AMD FP8 rowwise scaling grouped GEMM kernel](https://github.com/pytorch/FBGEMM/tree/main/fbgemm_gpu/experimental/gen_ai/src/quantize/ck_extensions/fp8_rowwise_grouped) to add support for the `_scaled_grouped_mm` API on AMD. `_scaled_grouped_mm` is [currently supported on Nvidia](9faef3d17c/aten/src/ATen/native/cuda/Blas.cpp (L1614)), this PR aims to bring parity to AMD. Related: [[RFC]: PyTorch Low-Precision GEMMs Public API](https://github.com/pytorch/pytorch/issues/157950#top) #157950.

The kernel is developed using the Composable Kernel framework. Only MI300X is currently supported. In the near future we plan to add support for MI350X as well. For data types we support FP8 e3m4.

The kernel support will be gated with the `USE_FBGEMM_GENAI` flag. We hope to enable this by default for relevant AMD builds.

Note we also update submodule `third_party/fbgemm` to 0adf62831 for the required updates from fbgemm.

Test Plan:

**Hipify & build**
```
python tools/amd_build/build_amd.py
USE_FBGEMM_GENAI=1 python setup.py develop
```

**Unit tests**
```
python test/test_matmul_cuda.py -- TestFP8MatmulCUDA
Ran 488 tests in 32.969s
OK (skipped=454)
```

**Performance Sample**
| G  | M | N | K | Runtime Ms | GB/S | TFLOPS |
| --  | -- | -- | -- | -- | -- | -- |
| 128 | 1 | 2048 | 5120 | 0.37| 3590 | 7.17 |
| 128 | 64 | 2048 | 5120 | 0.51| 2792 | 338.34 |
| 128 | 128 | 2048 | 5120 | 0.66| 2272 | 522.72 |
| 128 | 1 | 5120 | 1024 | 0.21| 3224 | 6.43 |
| 128 | 64 | 5120 | 1024 | 0.29| 2590 | 291.40 |
| 128 | 128 | 5120 | 1024 | 0.40| 2165 | 434.76 |
| 128 | 1 | 4096 | 4096 | 0.69| 3126 | 6.25 |
| 128 | 64 | 4096 | 4096 | 0.85| 2655 | 324.66 |
| 128 | 128 | 4096 | 4096 | 1.10| 2142 | 501.40 |
| 128 | 1 | 8192 | 8192 | 2.45| 3508 | 7.01 |
| 128 | 64 | 8192 | 8192 | 3.27| 2692 | 336.74 |
| 128 | 128 | 8192 | 8192 | 4.04| 2224 | 543.76 |
| 16 | 1 | 2048 | 5120 | 0.04| 3928 | 7.85 |
| 16 | 64 | 2048 | 5120 | 0.05| 3295 | 399.29 |
| 16 | 128 | 2048 | 5120 | 0.07| 2558 | 588.69 |
| 16 | 1 | 5120 | 1024 | 0.03| 3119 | 6.23 |
| 16 | 64 | 5120 | 1024 | 0.03| 2849 | 320.62 |
| 16 | 128 | 5120 | 1024 | 0.05| 2013 | 404.11 |
| 16 | 1 | 4096 | 4096 | 0.06| 4512 | 9.02 |
| 16 | 64 | 4096 | 4096 | 0.09| 3124 | 381.95 |
| 16 | 128 | 4096 | 4096 | 0.13| 2340 | 547.67 |
| 16 | 1 | 8192 | 8192 | 0.32| 3374 | 6.75 |
| 16 | 64 | 8192 | 8192 | 0.42| 2593 | 324.28 |
| 16 | 128 | 8192 | 8192 | 0.53| 2120 | 518.36 |

- Using ROCm 6.4.1
- Collected through `triton.testing.do_bench_cudagraph`

**Binary size with gfx942 arch**
Before: 116103856 Jul 23 14:12 build/lib/libtorch_hip.so
After:  118860960 Jul 23 14:29 build/lib/libtorch_hip.so
The difference is 2757104 bytes (~2.6 MiB).

Reviewers: @drisspg @ngimel @jwfromm @jeffdaily

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159075
Approved by: https://github.com/drisspg
2025-07-30 23:53:58 +00:00
Laith Sakka
aaa384b2d4 move view_meta to fake impl (#158406)
Python dispatcher is not always enabled in fake tensors and have to be called explicitly.
While it should be, it requires some work to get all tests working.

 I have been running in several issues where I add to add enable_python_dispatcher ex
  XLA, Helom ..etc to avoid issues related to that for the view specifically i moved it to fake tensor impl.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158406
Approved by: https://github.com/bobrenjc93
2025-07-25 08:21:27 +00:00
Laith Sakka
0b2ef76e85 DDE-Free select with unbacked index. (#157605)
When select has data dependent input, we cant tell if the actual index shall be index+size or index.
to avoid throwing dde, we allocate a new unbacked symbol to represent the storage offset of the
output view and we compute its value dynamically at runtime when inductor is lowered.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157605
Approved by: https://github.com/ColinPeppler
2025-07-24 20:08:05 +00:00
PyTorch MergeBot
23550ab735 Revert "DDE-Free select with unbacked index. (#157605)"
This reverts commit 79d7c754ab.

Reverted https://github.com/pytorch/pytorch/pull/157605 on behalf of https://github.com/laithsakka due to fail pr time benchmarks  ([comment](https://github.com/pytorch/pytorch/pull/157605#issuecomment-3084663020))
2025-07-17 16:20:02 +00:00
Laith Sakka
79d7c754ab DDE-Free select with unbacked index. (#157605)
When select has data dependent input, we cant tell if the actual index shall be index+size or index.
to avoid throwing dde, we allocate a new unbacked symbol to represent the storage offset of the
output view and we compute its value dynamically at runtime when inductor is lowered.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157605
Approved by: https://github.com/ColinPeppler
2025-07-17 05:08:11 +00:00
Aleksandar Samardžić
90618581e9 Fix grouped MM output strides when compiled but not max-autotuned (#158143)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158143
Approved by: https://github.com/ngimel
2025-07-15 11:53:13 +00:00
wengshiy
c8c221c0b3 [Inductor][Float8] Add float8_e4m3fn into assertion dtype list. (#157684)
Fix assert issue.
Add float8_e4m3fn into dtype list.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157684
Approved by: https://github.com/Xia-Weiwen, https://github.com/leslie-fang-intel, https://github.com/jansel
2025-07-15 06:02:01 +00:00
Valentine233
1f57e0e04d [CPU] Support GQA for flash attention (#157893)
As many models require GQA, we support it in flash attention for CPU path.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157893
Approved by: https://github.com/mingfeima, https://github.com/jansel
2025-07-13 09:49:02 +00:00
Xia, Weiwen
e1a20988f3 [Quant][CPU] Enable fp8 qconv (#157076)
**Summary**
Enable fp8 qconv on CPU. It's part of the plan to enable fp8 static quantization on CPU. This PR only adds FP8 support of the existing int8 qconv op. It does not add a new op nor does it affect frontend or quantization flow. The schema of the qconv op is not changed either.

So, the FP8 qconv shares the same op as INT8 qconv and the difference is that src/wei dtype is fp8 instead of int8. The output dtype can be fp8/float32/bfloat16. The implementation uses the oneDNN library.

Note:
OneDNN does not support quantized fp8 convolution until v3.9 but the version used in PyTorch is v3.7.2. So, the op goes to the reference kernel for now. And we have also update the oneDNN path so that it's compatible with the fp8 dtype. Once oneDNN is upgraded to v3.9 or newer, minimum changes are needed to enable the oneDNN path. And we have ensured that the behavior of the reference kernel is the same as the new oneDNN's implementation.
- oneDNN version < 3.9 (now)
  - Always go to the reference kernel
- oneDNN version >= 3.9 (future)
  - Go to reference kernel on old platforms (without AMX)
  - Use oneDNN on new platforms (with AMX)

**Test plan**
```
pytest test/quantization/core/test_quantized_op.py -k "qconv and fp8"
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157076
Approved by: https://github.com/leslie-fang-intel, https://github.com/jerryzh168
2025-07-11 10:00:57 +00:00
Aleksandar Samardžić
a3ec6d64b2 Update test after CUTLASS upgrade (#157903)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157903
Approved by: https://github.com/ngimel
2025-07-10 20:10:20 +00:00
Xuehai Pan
4cc8b60d1b [BE][1/16] fix typos in torch/ (#156311)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156311
Approved by: https://github.com/albanD
2025-07-09 11:02:22 +00:00
Laith Sakka
ed5d6d2a20 python definitely_contiguous-> is_contiguous_or_false (#156515)
We probably can avoid having those in python as well and  just depend on c++ impl after we land https://github.com/pytorch/pytorch/pull/155590 but that is for a different PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156515
Approved by: https://github.com/bobrenjc93
2025-06-30 17:31:51 +00:00
PyTorch MergeBot
75a7d9e868 Revert "python definitely_contiguous-> is_contiguous_or_false (#156515)"
This reverts commit 4c0091fda6.

Reverted https://github.com/pytorch/pytorch/pull/156515 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause some torch.export failures internally ([comment](https://github.com/pytorch/pytorch/pull/156515#issuecomment-3014104570))
2025-06-27 19:07:06 +00:00
Laith Sakka
cbcffce48a address remaining straight forward gso in meta_registrations (#156902)
Those are all straight forward generalization of existing checks,
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156902
Approved by: https://github.com/ColinPeppler
2025-06-27 06:19:54 +00:00
Laith Sakka
4c0091fda6 python definitely_contiguous-> is_contiguous_or_false (#156515)
We probably can avoid having those in python as well and  just depend on c++ impl after we land https://github.com/pytorch/pytorch/pull/155590 but that is for a different PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156515
Approved by: https://github.com/bobrenjc93
2025-06-26 00:47:14 +00:00