pytorch/torch/ao/quantization/quantizer
Yan Zhiwei f79b352f5a [Intel GPU] qconv_pointwise.binary XPU support (#135189)
# Motivation
This PR enables the quantized fusions `qconv+add` and `qconv+add+relu` on the Intel GPU (XPU) backend.
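Setting quantization scales and zero points aside, the fused `qconv+add+relu` pattern computes the residual add and the ReLU in the convolution's epilogue rather than as separate ops. A minimal per-element sketch (the function name is illustrative, not a PyTorch API):

```python
def qconv_add_relu_reference(conv_out, residual):
    # Reference semantics of the fused pattern, ignoring quantization
    # details: accumulate the residual into the convolution output,
    # then apply ReLU element-wise.
    return [max(c + r, 0.0) for c, r in zip(conv_out, residual)]
```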

At the backend level, we register the op via the schema `TORCH_SELECTIVE_NAME("onednn::qconv2d_pointwise.binary")`, which is the same schema already used by `x86InductorQuantizer`.

At the Inductor level, we made a small modification to `torch/_inductor/fx_passes/quantization.py` to allow the signed int8 data type (s8) during op lowering. For pattern matching, we largely reuse the existing code in `x86InductorQuantizer`.
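Conceptually, the lowering change relaxes a dtype gate so signed int8 outputs pass through alongside unsigned int8. A hedged sketch of that gate; the real check lives in `torch/_inductor/fx_passes/quantization.py`, and these names are illustrative, not actual PyTorch internals:

```python
# Hypothetical dtype gate for lowering qconv2d_pointwise.binary.
ALLOWED_QCONV_BINARY_OUTPUT_DTYPES = {"uint8", "int8"}  # s8 now accepted

def can_lower_qconv_binary(output_dtype: str) -> bool:
    # Previously only unsigned int8 (u8) was allowed; the XPU backend
    # also produces signed int8 (s8) outputs, so s8 must pass the gate.
    return output_dtype in ALLOWED_QCONV_BINARY_OUTPUT_DTYPES
```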

# UT verification
```bash
python test/inductor/test_mkldnn_pattern_matcher.py -v \
   -k test_qconv2d_add_xpu \
   -k test_qconv2d_add_relu_xpu 2>&1
```

# Runtime exemplification
The following oneDNN verbose log was collected while running the UTs:
```bash
onednn_verbose,primitive,exec,gpu:0,convolution,jit:ir,forward_training,src_s8::blocked:acdb::f0 wei_s8::blocked:abcd::f0 bia_f32::blocked:a::f0 dst_s8::blocked:acdb::f0,attr-scratchpad:user attr-scales:src0:0:f32+dst:0:f32+wei:1:f32 attr-zero-points:src0:0:s32+dst:0:s32 attr-post-ops:eltwise_linear:1:0.337704+sum:0.0241217+eltwise_relu,alg:convolution_direct,mb1_ic3oc6_ih8oh6kh3sh1dh0ph0_iw8ow6kw3sw1dw0pw0,0.151123
```
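The `attr-post-ops` field above chains `eltwise_linear`, `sum`, and `eltwise_relu`, confirming the fused epilogue. A rough pure-Python sketch of that chain (the parameter mapping is my reading of the verbose line, and the function name is illustrative):

```python
def apply_post_ops(acc, residual, alpha, beta, sum_scale):
    """Sketch of the oneDNN post-op chain from the verbose log:
    eltwise_linear -> sum -> eltwise_relu (parameters illustrative)."""
    x = alpha * acc + beta          # eltwise_linear: alpha * x + beta
    x = x + sum_scale * residual    # sum post-op: add scaled residual
    return max(x, 0.0)              # eltwise_relu
```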

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135189
Approved by: https://github.com/liangan1, https://github.com/EikanWang, https://github.com/guangyey, https://github.com/jerryzh168
ghstack dependencies: #133307

Co-authored-by: guangyey <guangye.yu@intel.com>
2025-02-20 02:02:54 +00:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | [BE] enable UFMT for torch/ao/quantization/ (#128863) | 2024-07-25 04:17:54 +00:00 |
| `composable_quantizer.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `embedding_quantizer.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `quantizer.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `utils.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `x86_inductor_quantizer.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `xnnpack_quantizer_utils.py` | PEP585 update - torch/ao/quantization (#145140) | 2025-01-19 10:20:00 +00:00 |
| `xnnpack_quantizer.py` | [WIP] Move XNNPACKQuantizer from PyTorch to ExecuTorch (#144940) | 2025-01-24 10:06:07 +00:00 |
| `xpu_inductor_quantizer.py` | [Intel GPU] qconv_pointwise.binary XPU support (#135189) | 2025-02-20 02:02:54 +00:00 |