mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-06 12:20:52 +01:00
**Summary** This PR adds a lowering pass for x86 backend - Patterns of `dequantize -> conv/linear (-> quantize)` are fused to corresponding quantized onednn ops. - Weights are prepacked ahead of time. - Post ops of conv/linear are fused if supported. - The pass returns a `GraphModule` with the modifications mentioned above. **Test plan** ``` pytest test/quantization/pt2e/test_x86inductor_quantizer.py -k test_lowering_to_x86 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/149708 Approved by: https://github.com/jerryzh168, https://github.com/leslie-fang-intel |
||
|---|---|---|
| .. | ||
| nn | ||
| ns | ||
| pruning | ||
| quantization | ||
| __init__.py | ||