pytorch/torch/ao/quantization/pt2e
Xia, Weiwen 3b0cd9b542 [Quant][PT2E] add a lowering pass for x86 backend (#149708)
**Summary**
This PR adds a lowering pass for the x86 backend:
- Patterns of `dequantize -> conv/linear (-> quantize)` are fused to corresponding quantized onednn ops.
- Weights are prepacked ahead of time.
- Post ops of conv/linear are fused if supported.
- The pass returns a `GraphModule` with the modifications mentioned above.

**Test plan**
```
pytest test/quantization/pt2e/test_x86inductor_quantizer.py -k test_lowering_to_x86
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149708
Approved by: https://github.com/jerryzh168, https://github.com/leslie-fang-intel
2025-04-01 17:32:41 +00:00
| Name | Last commit message | Last commit date |
|------|---------------------|------------------|
| `representation` | Migrate from Tuple -> tuple in torch/ao (#144265) | 2025-01-10 00:12:06 +00:00 |
| `__init__.py` | | |
| `_affine_quantization.py` | fix pt2e block wise quantization test (#147035) | 2025-02-13 19:44:56 +00:00 |
| `_numeric_debugger.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `duplicate_dq_pass.py` | | |
| `export_utils.py` | [reland] Kill capture_pre_autograd_graph API (#143426) | 2024-12-18 12:07:09 +00:00 |
| `graph_utils.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `lowering.py` | [Quant][PT2E] add a lowering pass for x86 backend (#149708) | 2025-04-01 17:32:41 +00:00 |
| `port_metadata_pass.py` | | |
| `prepare.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `qat_utils.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `utils.py` | patch for block-wise quantization + pt2e (#146946) | 2025-02-18 01:15:26 +00:00 |