Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19118
A bug introduced by D14700576, reported by Yufei and fixed by D14778810 and D14785256, was not detected by our unit tests.
This diff improves the unit tests to catch such errors (with this diff applied but without D14778810, we can reproduce the bug Yufei reported).
This improvement also revealed an accuracy bug that occurs when we pre-pack the weight and bias together and the pre-packed weight/bias are used by multiple nets: we were modifying the pre-packed bias in place, even though it was supposed to be constant.
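The failure mode generalizes beyond this operator; here is a minimal numpy sketch (all names hypothetical, not the actual operator code) of why adjusting shared pre-packed data in place corrupts later consumers:

```python
import numpy as np

# Hypothetical stand-in for a bias that was pre-packed once and is then
# shared, as a constant, by several nets.
packed_bias = np.array([100, -50, 25], dtype=np.int32)

def run_net(bias, act_zero_point, col_offsets):
    # BUG: folding the activation zero-point into the shared pre-packed
    # bias in place. The first net to run silently changes what every
    # subsequent net sees.
    bias -= act_zero_point * col_offsets
    return bias

col_offsets = np.array([10, 20, 30], dtype=np.int32)
run_net(packed_bias, 2, col_offsets)  # net 1: bias adjusted once
run_net(packed_bias, 2, col_offsets)  # net 2: bias adjusted a second time
# Net 2 computed with a doubly-adjusted bias. The fix is to leave the
# pre-packed data untouched, e.g. bias = bias - act_zero_point * col_offsets.
```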
Reviewed By: csummersea
Differential Revision: D14806077
fbshipit-source-id: aa9049c74b6ea98d21fbd097de306447a662a46d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15758
DNNLOWP Conv operators became very complex due to the many options they support. This diff simplifies them by no longer allowing fp32 inputs/outputs. This is acceptable for Conv because it is usually used in deep networks, where quantizing and dequantizing with separate operators adds little overhead.
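For illustration, a sketch of that pattern in the Caffe2 Python API (assuming the Int8* operator names and DNNLOWP engine used elsewhere in these diffs; attributes are illustrative, not the exact net from this diff):

```python
from caffe2.python import core

net = core.Net("int8_conv")
# Quantize the fp32 input with an explicit operator, instead of having
# Conv itself accept fp32.
net.Int8Quantize(["X"], ["X_q"], engine="DNNLOWP")
net.Int8Conv(["X_q", "W", "b"], ["Y_q"], kernel=3, engine="DNNLOWP")
# Dequantize back to fp32 only where the network actually needs fp32.
net.Int8Dequantize(["Y_q"], ["Y"], engine="DNNLOWP")
```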
Reviewed By: csummersea
Differential Revision: D13587341
fbshipit-source-id: e88c919dae79d1c5b7d787ea539edf5bcb064afc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15588
Use the NHWC2NCHW and NCHW2NHWC functions, which are easier to understand than code using raw transposes and generalize to non-2D convolutions.
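A minimal numpy sketch of what such helpers compute (the real functions live in the diff; these equivalents are for illustration only):

```python
import numpy as np

def nhwc2nchw(x):
    # Move the channel axis from last to position 1. Works for any number
    # of spatial dimensions: NHWC -> NCHW, NTHWC -> NCTHW, etc.
    return np.moveaxis(x, -1, 1)

def nchw2nhwc(x):
    # Inverse: move the channel axis from position 1 back to last.
    return np.moveaxis(x, 1, -1)
```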
Reviewed By: csummersea
Differential Revision: D13557674
fbshipit-source-id: c4fdb8850503ea58f6b17b188513ae2b29691ec0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14881
This diff allows us to pre-quantize and pre-pack the weight matrix used by DNNLOWP_ACC16.
The intended use pattern is to run Int8ConvPackWeight in the init_net to generate a packed weight, which Int8Conv with the DNNLOWP_ACC16 engine then consumes.
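A sketch of that pattern (exact inputs and attributes may differ from the diff; this only shows the init_net/predict_net split):

```python
from caffe2.python import core

init_net = core.Net("init")
# Quantize and pack the weight once, when the model is loaded.
init_net.Int8ConvPackWeight(["W"], ["W_packed"], engine="DNNLOWP_ACC16")

predict_net = core.Net("predict")
# Every run reuses the packed weight instead of re-packing it.
predict_net.Int8Conv(["X_q", "W_packed", "b"], ["Y_q"],
                     kernel=3, engine="DNNLOWP_ACC16")
```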
Reviewed By: csummersea
Differential Revision: D13374662
fbshipit-source-id: dd02b9a4eb7af1fe208aa857fcd0b445e6e395af
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14350
acc32 for now. acc16 will follow in a separate diff, since it needs a different output processing that does sparse convolution without im2col.
Reviewed By: dskhudia
Differential Revision: D13188595
fbshipit-source-id: e8faee46c7ea43e4a600aecb8b8e93e6c860a8c8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13660
Any change to a server-side quantized operator was triggering ios-sanity-check, with more than 5 hours of testing time. I suspect this was because the operator code was synced into the xplat directory. This diff moves the server-side quantized operators to caffe2/caffe2/quantization/server to avoid this issue.
Reviewed By: hx89
Differential Revision: D12955420
fbshipit-source-id: b6c824b9de5e2a696f8c748e1b2c77d81d46746b