Summary:
Since the tensor iterator supports broadcasting, we can simply remove the assertion on input shapes.
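For context, a minimal illustration of the broadcasting the tensor iterator provides (plain fp32 tensors; the summary does not name the specific op touched):
```
import torch

# Operands with compatible (not identical) shapes are expanded automatically,
# so an explicit same-shape assertion is unnecessary.
a = torch.rand(4, 1, 3)
b = torch.rand(5, 1)
print((a + b).shape)  # torch.Size([4, 5, 3])
```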
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30442
Differential Revision: D19976562
Pulled By: lly-zero-one
fbshipit-source-id: 91b27fc8b2570f29d110c6df26eacdd16f587b9f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33080
Quantized batch norm for cases where batch norm cannot be fused with conv.
The AVX2 implementation is taken from Caffe2.
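For context, a sketch of the usual conv + bn fusion path; when no conv precedes the batch norm, fusion is impossible and the standalone quantized kernel added here is needed instead (the toy module is illustrative):
```
import torch
import torch.nn as nn

class ConvBN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, 1)
        self.bn = nn.BatchNorm2d(3)

    def forward(self, x):
        return self.bn(self.conv(x))

m = ConvBN().eval()
# bn is folded into conv here, so no quantized batch norm kernel is required.
fused = torch.quantization.fuse_modules(m, [['conv', 'bn']])
```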
Test Plan:
python test/test_quantized.py TestQuantizedOps.test_batch_norm
Imported from OSS
Differential Revision: D19861927
fbshipit-source-id: bd8cd101fc063cb6358132ab7c651a160999293c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32479
Run dynamic quantization on mobile (similar to FBGEMM). Currently this is only implemented for the linear operator.
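A minimal sketch of the dynamic quantization flow this enables (model and shapes are illustrative); on mobile builds the same flow dispatches to the QNNPACK-backed kernel:
```
import torch

model = torch.nn.Sequential(torch.nn.Linear(64, 32)).eval()
# Weights are quantized ahead of time; activations are quantized dynamically.
qmodel = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
out = qmodel(torch.rand(1, 64))
```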
Test Plan:
python test/test_quantized.py TestDynamicQuantizedLinear.test_qlinear
Imported from OSS
Differential Revision: D19542980
fbshipit-source-id: c9f6e5e8ded4d62ae0f2ed99e478c8307dde22ed
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445
Create distributed and rpc directories under caffe2/test for better management of unit tests.
Differential Revision: D18702786
fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31031
This activation will be needed for the LSTM implementation.
Also includes the QNNPACK implementation.
Test Plan: Imported from OSS
Differential Revision: D19334280
Pulled By: z-a-f
fbshipit-source-id: ae14399765a47afdf9b1e072d3967c24ff473e8d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31031
This activation will be needed for the LSTM implementation.
Also includes the QNNPACK implementation.
Test Plan: Imported from OSS
Differential Revision: D18903453
Pulled By: z-a-f
fbshipit-source-id: 0050b1cebb1ddb179b7ecbcb114fe70705070f67
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30890
We've received far too many complaints about this functionality making tests flaky, and it's not providing value to us anyway. Let's remove deadline testing.
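For reference, a common way to disable hypothesis deadlines suite-wide is a settings profile (the profile name below is illustrative):
```
from hypothesis import settings

# Register and activate a profile with deadline checking turned off.
settings.register_profile("pytorch_no_deadline", deadline=None)
settings.load_profile("pytorch_no_deadline")
```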
Test Plan: Imported from OSS
Differential Revision: D18857597
Pulled By: jamesr66a
fbshipit-source-id: 67e3412795ef2fb7b7ee896169651084e434d2f6
Summary: Add a unit test for the Dynamic Quantized Linear operator (`torch.fbgemm_linear_quantize_weight`, `torch.fbgemm_pack_quantized_matrix`, and `torch.fbgemm_linear_int8_weight`) in `test_quantized.py`.
Test Plan:
buck test mode/dev caffe2/test:quantized -- 'test_qlinear_legacy \(test_quantized\.TestDynamicQuantizedLinear\)' --print-passing-details
[jianyuhuang@devvm29567.prn1.facebook.com: ~/fbsource/fbcode/caffe2/test] $ buck test mode/dev caffe2/test:quantized -- 'test_dynamic_qlinear \(test_quantized\.TestQuantizedLinear\)' --print-passing-details
Parsing buck files: finished in 1.8 sec
Building: finished in 3.4 sec (100%) 6772/6772 jobs, 2 updated
Total time: 5.2 sec
Trace available for this run at /tmp/testpilot.20190714-220130.2698168.log
TestPilot test runner for Facebook. See https://fburl.com/testpilot for details.
Testpilot build revision 4f180136f799ab45ec2bf5d7644cb14955d4dd7a fbpkg 6c6253f255644ca3b8ce1bc5955b0f25 at Mon Jul 8 14:13:38 2019 by twsvcscm from /usr/local/fbprojects/packages/testinfra.testpilot/651/t.par
Discovering tests
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/1125900044862617
✓ caffe2/test:quantized - test_dynamic_qlinear (test_quantized.TestQuantizedLinear) 0.023 1/1
(passed)
Test output:
> test_dynamic_qlinear (test_quantized.TestQuantizedLinear) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 0.024s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/1125900044862617
Summary (total time 9.03s):
PASS: 1
FAIL: 0
SKIP: 0
FATAL: 0
TIMEOUT: 0
OMIT: 0
Differential Revision: D16404027
fbshipit-source-id: 4c85dd255637fd8b1eb4830e0464f48c22706f41
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29529
Pull Request resolved: https://github.com/pytorch/glow/pull/3771
We would like to replace `conv_prepack` with `conv2d_prepack` and `conv_unpack` with `conv2d_unpack`.
This makes the naming consistent between 2D and 3D conv:
```
torch.ops.quantized.conv2d_prepack
torch.ops.quantized.conv2d_unpack
torch.ops.quantized.conv2d
torch.ops.quantized.conv3d_prepack
torch.ops.quantized.conv3d_unpack
torch.ops.quantized.conv3d
```
We should do this sooner rather than later, before the quantized conv2d ops gain more users, for better engineering.
The replacement was done with the following bash commands:
```
find ./ -type f -exec sed -i -e 's/quantized::conv_prepack/quantized::conv2d_prepack/g' {} \;
find ./ -type f -exec sed -i -e 's/quantized::conv_unpack/quantized::conv2d_unpack/g' {} \;
find ./ -type f -exec sed -i -e 's/torch.ops.quantized.conv_prepack/torch.ops.quantized.conv2d_prepack/g' {} \;
find ./ -type f -exec sed -i -e 's/torch.ops.quantized.conv_unpack/torch.ops.quantized.conv2d_unpack/g' {} \;
```
ghstack-source-id: 93661879
Test Plan: CI
Reviewed By: jackm321
Differential Revision: D18421079
fbshipit-source-id: 17ae8b1ee79223bd2c5d4bbccd57af6580c4ab12
Summary:
This test is reported to be flaky due to deadline expiration. This PR flags it as a no_deadline test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29502
Differential Revision: D18416632
Pulled By: lly-zero-one
fbshipit-source-id: 27cd7b28139f3f16ee0cf5802a0709385719d487
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29328
Tests are flaky, as seen in issue #29326.
Disable them until we fix the kernels.
Test Plan:
python test/test_quantized.py TestQNNPackOps
Imported from OSS
Differential Revision: D18358200
fbshipit-source-id: 58f1981799fe8253234fcc7b0540e1c0b6babc15
Summary:
This is actually a bug in both the testing and the average pool implementation.
In the test, we used the quantized values as the float input and failed to pad with the zero_point.
In the op implementation, the divisor used for averaging is incorrect in the padded case when count_include_pad is true.
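For reference, the count_include_pad semantics of the fp32 op, which the quantized kernel must match:
```
import torch
import torch.nn.functional as F

x = torch.ones(1, 1, 4, 4)
# Padded zeros are counted in the divisor: the corner output is 4/9.
incl = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1, count_include_pad=True)
# Only real elements are averaged: every output stays 1.0.
excl = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1, count_include_pad=False)
print(incl[0, 0, 0, 0].item(), excl[0, 0, 0, 0].item())  # ~0.444 1.0
```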
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28260
Differential Revision: D18039960
Pulled By: lly-zero-one
fbshipit-source-id: 7b5d34498b60f5d574a276a22798c9f576944734
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27675
This leverages QNNPACK global average pooling to perform torch.mean on input feature maps.
Currently it only supports the mean over the HxW plane of an NCHW tensor.
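A minimal sketch of the call pattern (using the post-rename quantize_per_tensor; shapes and quantization parameters are illustrative):
```
import torch

x = torch.rand(1, 3, 8, 8)
qx = torch.quantize_per_tensor(x, scale=0.05, zero_point=64, dtype=torch.quint8)
# Reduces only the HxW plane; the result is a quantized tensor of shape (1, 3).
qy = qx.mean([2, 3])
```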
Test Plan:
python test/test_quantized.py TestQuantizedOps.test_mean
Imported from OSS
Differential Revision: D17989336
fbshipit-source-id: 8d4cbcbed5f146290b1580d26e5b45359d293761
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28246
Updated the reference fp32 implementation to use the dequantized input tensor so that padded values are correctly taken into account.
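A sketch of the corrected reference pattern (op names per current PyTorch): padding the dequantized tensor with 0.0 is equivalent to padding the quantized tensor with its zero_point:
```
import torch
import torch.nn.functional as F

x = torch.rand(1, 2, 6, 6)
qx = torch.quantize_per_tensor(x, scale=0.05, zero_point=64, dtype=torch.quint8)
# fp32 reference computed on the dequantized input, so the implicit zero
# padding matches the quantized kernel's zero_point padding.
ref = F.avg_pool2d(qx.dequantize(), kernel_size=3, padding=1)
```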
Test Plan:
python test/test_quantized.py TestQNNPackOps.test_avg_pool2d
Imported from OSS
Differential Revision: D17989334
fbshipit-source-id: 848ce78713280f529f71ff48e930db8de18abc62
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27631
Add support for performing avg_pool2d on mobile; tested using the existing avg_pool2d Python tests.
Uses the QNNPACK backend, which currently only supports 4-dimensional inputs.
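A minimal sketch, assuming a build with QNNPACK enabled:
```
import torch
import torch.nn.functional as F

torch.backends.quantized.engine = 'qnnpack'
x = torch.rand(1, 8, 16, 16)  # 4-dim (N, C, H, W) input only
qx = torch.quantize_per_tensor(x, 0.1, 128, torch.quint8)
qy = F.avg_pool2d(qx, kernel_size=2)
```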
Test Plan:
python test/test_quantized.py TestQNNPackOps.test_avg_pool2d
Imported from OSS
Differential Revision: D17973792
fbshipit-source-id: 95ffffb2da656ed911a618b9cb68d6b728c16c74
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27616
Fix a problem in the reference implementation of `equal`.
Test Plan:
python test/test_quantized.py
Imported from OSS
Differential Revision: D17837055
fbshipit-source-id: 1e4bc32f4334c0352468a61fa4316a1c0ff76485
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26992
Run the same tests for the FBGEMM and QNNPACK backends.
Checks that QNNPACK or FBGEMM is supported before running (using supported_qengines).
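The gating pattern, sketched in terms of the public torch.backends.quantized API that supported_qengines wraps:
```
import torch

for engine in torch.backends.quantized.supported_engines:
    if engine == 'none':
        continue  # skip the no-op engine
    torch.backends.quantized.engine = engine
    # ... run the quantized Linear/Conv test body under this engine ...
```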
Test Plan:
python test/test_quantized.py TestQuantizedLinear
python test/test_quantized.py TestQuantizedConv
python test/test_quantized_models.py
python test/test_quantized_nn_mods.py
Imported from OSS
Differential Revision: D17689171
fbshipit-source-id: e11c0a5e41f5f4e6836a614a5b61e4db3c5e384b
Summary:
The QuantizedAVX2 path does not support the int32 type, so we switch to the at::quantize_vec function instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26854
Differential Revision: D17609872
Pulled By: llyfacebook
fbshipit-source-id: b4a77d93ce0ebfef696506b5cdbe3e91fe44bb36
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26676
This just makes it more user-friendly to be able to pass arbitrary floating-point or integer values as scales or zero_points for per-channel quantization. It matches the behavior of the per-tensor quantizer, where those arguments are scalars (not tensors) and thus automatic casting is applied.
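An illustration of the casting this enables (using the post-rename quantize_per_channel):
```
import torch

x = torch.rand(2, 3)
scales = torch.tensor([0.1, 0.2])   # float32; cast internally as needed
zero_points = torch.tensor([0, 5])  # int64; cast internally as needed
qx = torch.quantize_per_channel(x, scales, zero_points, axis=0, dtype=torch.quint8)
```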
Test Plan: Imported from OSS
Differential Revision: D17537051
Pulled By: dzhulgakov
fbshipit-source-id: e955ccdb5b4691828a559dc8f1ed7de54b6d12c4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26675
Based on an offline poll, we're very unlikely to have multi-axis quantized tensors in the foreseeable future. Let's simplify the API and just return an int instead of a list. It also matches the singular `axis` name.
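After this change (assuming the current accessor name `q_per_channel_axis`):
```
import torch

x = torch.rand(2, 3)
qx = torch.quantize_per_channel(
    x, torch.tensor([0.1, 0.2]), torch.tensor([0, 0]), axis=0, dtype=torch.quint8
)
assert isinstance(qx.q_per_channel_axis(), int)  # previously a list
```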
Test Plan: Imported from OSS
Differential Revision: D17537052
Pulled By: dzhulgakov
fbshipit-source-id: 676abc3b251d288468aaed467b5e5ca4063b98b0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26524
This creates an NHWC specialization for `quantized::cat` that kicks in when all inputs are NHWC. This ensures the correct layout is propagated downstream and provides an optimized implementation for this data layout.
Benchmark script based on Squeezenet shapes:
```
import torch, time

torch.manual_seed(0)

# NHWC
sizes = [
    (1, 54, 54, 64),
    (1, 54, 54, 128),
    (1, 26, 26, 128),
    (1, 26, 26, 256),
    (1, 12, 12, 256),
]

for size in sizes:
    x = torch.rand(*size)
    y = torch.rand(*size)
    qX = torch.quantize_linear(x, 0.01, 3, torch.qint8).permute([0, 3, 1, 2])
    qY = torch.quantize_linear(y, 0.01, 3, torch.qint8).permute([0, 3, 1, 2])
    ref = torch.cat([qX.dequantize(), qY.dequantize()], dim=1)

    NITER = 1000
    s = time.time()
    for i in range(NITER):
        out = torch.ops.quantized.cat([qX, qY], dim=1, scale=0.01, zero_point=3)
    time_per_iter = (time.time() - s) / NITER

    print('time per iter ms', time_per_iter * 1000)
    print('gb/s', (qX.numel() + qY.numel() + out.numel()) * qX.element_size() / time_per_iter / 1e9)

    torch.testing.assert_allclose(out.dequantize(), ref)
```
Before this change
```
time per iter ms 0.6898486614227295
gb/s 1.0821156026605054
time per iter ms 1.5480577945709229
gb/s 0.9644291093239284
time per iter ms 0.3180875778198242
gb/s 1.0881028500775023
time per iter ms 0.6702737808227539
gb/s 1.032748139350315
time per iter ms 0.13010454177856445
gb/s 1.1333655073392244
```
After this change
```
time per iter ms 0.11604785919189453
gb/s 6.432656364350577
time per iter ms 0.15956878662109375
gb/s 9.356416324360508
time per iter ms 0.040181636810302734
gb/s 8.613685939027139
time per iter ms 0.06564664840698242
gb/s 10.544696748392909
time per iter ms 0.018549680709838867
gb/s 7.949247337814738
```
Test Plan: Imported from OSS
Differential Revision: D17503593
Pulled By: jamesr66a
fbshipit-source-id: ec5d57ad8fbcb3fd9379e8bd370abd29d386f953
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26586
Use the backend engine flag to call QNNPACK for quantized ops.
Test Plan: python test/test_quantized.py TestQNNPACKOps
Differential Revision: D17515129
Pulled By: supriyar
fbshipit-source-id: 951e90205aa19581ea006a91d9514fc7a94409ef
Summary:
In this PR, we fix the Windows build issue from D17437015.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26580
Differential Revision: D17517341
Pulled By: llyfacebook
fbshipit-source-id: db726596aa8f7c992c5a7ddc2781dc3aa0312284
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26599
These tests fail due to tolerance issues in the equality comparison. Disable them for now.
ghstack-source-id: 90553855
Test Plan: unit tests
Differential Revision: D17517085
fbshipit-source-id: a4d9278e356318719ccd84047404915a97944f52
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26575
To keep consistent with `quantize_per_tensor`, we also rename `quantize_linear_per_channel` to `quantize_per_channel`.
Test Plan:
ci
Imported from OSS
Differential Revision: D17517360
fbshipit-source-id: 3af7d8f0fbe99148b79fcb1ad2fe811f776590cd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26574
Since we also have `quantized::linear`, `quantize_linear` sounds confusing, so we plan to rename it before the branch cut.
Test Plan:
ci
Imported from OSS
Differential Revision: D17514876
fbshipit-source-id: 01d9005e6ec8cb9950b9d8bba122109c389641d3