pytorch/torch/ao/quantization/pt2e
Xia, Weiwen 3b0cd9b542 [Quant][PT2E] add a lowering pass for x86 backend (#149708)
**Summary**
This PR adds a lowering pass for the x86 backend:
- Patterns of `dequantize -> conv/linear (-> quantize)` are fused to corresponding quantized onednn ops.
- Weights are prepacked ahead of time.
- Post ops of conv/linear are fused if supported.
- The pass returns a `GraphModule` with the modifications mentioned above.

**Test plan**
```
pytest test/quantization/pt2e/test_x86inductor_quantizer.py -k test_lowering_to_x86
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149708
Approved by: https://github.com/jerryzh168, https://github.com/leslie-fang-intel
2025-04-01 17:32:41 +00:00
| Name | Last commit message | Last commit date |
|------|---------------------|------------------|
| `representation` | Migrate from Tuple -> tuple in torch/ao (#144265) | 2025-01-10 00:12:06 +00:00 |
| `__init__.py` | | |
| `_affine_quantization.py` | fix pt2e block wise quantization test (#147035) | 2025-02-13 19:44:56 +00:00 |
| `_numeric_debugger.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `duplicate_dq_pass.py` | | |
| `export_utils.py` | [reland] Kill capture_pre_autograd_graph API (#143426) | 2024-12-18 12:07:09 +00:00 |
| `graph_utils.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `lowering.py` | [Quant][PT2E] add a lowering pass for x86 backend (#149708) | 2025-04-01 17:32:41 +00:00 |
| `port_metadata_pass.py` | | |
| `prepare.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `qat_utils.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `utils.py` | patch for block-wise quantization + pt2e (#146946) | 2025-02-18 01:15:26 +00:00 |