Commit Graph

81 Commits

Author SHA1 Message Date
Lingyi Liu
ba8002ec13 Quantized Interpolate Kernel(upsample_nearest2d) (#26617)
Summary:
In this PR, we implement support for quantized interpolate for the upsample_nearest2d case.

import torch, time

for dtype in [torch.qint8, torch.quint8, torch.qint32]:
    print('****', str(dtype), '*****')
    x = torch.rand(1, 56, 56, 256)

    q_x = torch.quantize_per_tensor(x, 0.5, 1, dtype)
    q_x = q_x.permute([0, 3, 1, 2])

    x = x.permute([0, 3, 1, 2])

    NITER = 100

    s = time.time()
    for i in range(NITER):
        # float_out = torch.nn.functional.avg_pool2d(x, kernel_size=5, stride=None, padding=0)
        # float_out = torch.nn.functional.adaptive_avg_pool2d(x, output_size=5)
        float_out = torch.nn.functional.interpolate(x, size=5, scale_factor=None, mode="nearest", align_corners=None)
    time_per_iter_float = (time.time() - s) / NITER

    s = time.time()
    for i in range(NITER):
        # quant_out = torch.nn.quantized.functional.avg_pool2d(q_x, kernel_size=5, stride=None, padding=0)
        # quant_out = torch.nn.quantized.functional.adaptive_avg_pool2d(q_x, output_size=5)
        quant_out = torch.nn.quantized.functional.interpolate(q_x, size=5, scale_factor=None, mode="nearest", align_corners=None)
    time_per_iter_quant = (time.time() - s) / NITER

    ref_quantized = torch.quantize_per_tensor(float_out, 0.5, 1, dtype)
    #  torch.testing.assert_allclose(ref_quantized.dequantize(), quant_out.dequantize())

    print('time/iter ms (float)', 'time/iter ms (quant)', 'quant/float', sep='\t')
    print(time_per_iter_float * 1000, time_per_iter_quant * 1000, time_per_iter_quant / time_per_iter_float, sep='\t')

    bytes_float = (x.numel() + float_out.numel()) * x.element_size()
    bytes_quant = (q_x.numel() + quant_out.numel()) * q_x.element_size()

    float_bw_gbps = bytes_float / time_per_iter_float / 1e9
    quant_bw_gbps = bytes_quant / time_per_iter_quant / 1e9

    print('GB/s float', 'GB/s quant', sep='\t')
    print(float_bw_gbps, quant_bw_gbps, sep='\t')

=========without special handling of NHWC layout=============
**** torch.qint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.08712100982666        2.1624231338500977      1.0360794240817361
GB/s float      GB/s quant
1.5508750976872339      0.37421723220248165
**** torch.quint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.056601047515869       2.184889316558838       1.0623787823107091
GB/s float      GB/s quant
1.573890086222483       0.3703693335250963
**** torch.qint32 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.0152783393859863      2.067704200744629       1.0260142037623525
GB/s float      GB/s quant
1.6061622539873104      1.5654386148823074

=========with special handling of NHWC layout=============
**** torch.qint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.044649124145508       0.009250640869140625    0.004524317038018256
GB/s float      GB/s quant
1.5830902044636819      87.47675014597938
**** torch.quint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.049403190612793       0.009107589721679688    0.004444020465761265
GB/s float      GB/s quant
1.579417859221808       88.8507305147644
**** torch.qint32 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.0601415634155273      0.01062631607055664     0.0051580513976618066
GB/s float      GB/s quant
1.5711852318699757      304.6082930818039
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26617

Differential Revision: D17519146

Pulled By: llyfacebook

fbshipit-source-id: 126876e550ef7009fd75f5ccc033599f1f37456d
2019-09-23 20:32:19 -07:00
Lingyi Liu
eca01eb0a6 quantized average_pool2d and adaptive_avg_pool2d implementation(Revert d17437015) (#26580)
Summary:
In this PR, we fix the Windows build issue from D17437015.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26580

Differential Revision: D17517341

Pulled By: llyfacebook

fbshipit-source-id: db726596aa8f7c992c5a7ddc2781dc3aa0312284
2019-09-21 11:10:26 -07:00
Jerry Zhang
254122dd4e quantize_linear -> quantize_per_tensor (#26574)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26574

Since we also have `quantized::linear`, the name `quantize_linear` sounds confusing, so we plan to rename it before the branch cut.
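
For reference, a minimal sketch of the rename (the tensor shape, scale, and zero_point here are arbitrary illustration values):

```
import torch

x = torch.rand(4, 4)

# Before this change: q_x = torch.quantize_linear(x, 0.5, 1, torch.quint8)
# After this change:
q_x = torch.quantize_per_tensor(x, scale=0.5, zero_point=1, dtype=torch.quint8)
print(q_x.int_repr())    # underlying uint8 values
print(q_x.dequantize())  # back to float
```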

Test Plan:
ci

Imported from OSS

Differential Revision: D17514876

fbshipit-source-id: 01d9005e6ec8cb9950b9d8bba122109c389641d3
2019-09-20 21:58:48 -07:00
Lingyi Liu
f0b7132b87 Revert D17437015: [pytorch][PR] Add the quantized average_pool2d support and adaptive_avg_pool2d support
Test Plan: revert-hammer

Differential Revision:
D17437015

Original commit changeset: 496aed1e4171

fbshipit-source-id: 53e22a85e06bd9d7827579b124b7f136230b6c1d
2019-09-20 15:01:49 -07:00
Lingyi Liu
6411b92d6e Add the quantized average_pool2d support and adaptive_avg_pool2d support (#25899)
Summary:
Copied from PR https://github.com/pytorch/pytorch/issues/25676

===============For avg_pool2d==============

import torch, time

for dtype in [torch.qint8, torch.quint8, torch.qint32]:
    print('****', str(dtype), '*****')
    x = torch.rand(1, 56, 56, 256)

    q_x = torch.quantize_linear(x, 0.5, 1, dtype)
    q_x = q_x.permute([0, 3, 1, 2])

    x = x.permute([0, 3, 1, 2])

    NITER = 100

    s = time.time()
    for i in range(NITER):
        float_out = torch.nn.functional.avg_pool2d(x, kernel_size=3, stride=None, padding=0)
    time_per_iter_float = (time.time() - s) / NITER

    s = time.time()
    for i in range(NITER):
        quant_out = torch.nn.quantized.functional.avg_pool2d(q_x, kernel_size=3, stride=None, padding=0)
    time_per_iter_quant = (time.time() - s) / NITER

    ref_quantized = torch.quantize_linear(float_out, 0.5, 1, dtype)
    torch.testing.assert_allclose(ref_quantized.dequantize(), quant_out.dequantize())

    print('time/iter ms (float)', 'time/iter ms (quant)', 'quant/float', sep='\t')
    print(time_per_iter_float * 1000, time_per_iter_quant * 1000, time_per_iter_quant / time_per_iter_float, sep='\t')

    bytes_float = (x.numel() + float_out.numel()) * x.element_size()
    bytes_quant = (q_x.numel() + quant_out.numel()) * q_x.element_size()

    float_bw_gbps = bytes_float / time_per_iter_float / 1e9
    quant_bw_gbps = bytes_quant / time_per_iter_quant / 1e9

    print('GB/s float', 'GB/s quant', sep='\t')
    print(float_bw_gbps, quant_bw_gbps, sep='\t')

Before the vectorization:
**** torch.qint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.67439603805542        7.126874923706055       2.6648539791017924
GB/s float      GB/s quant
1.2470733401269298      0.11699265230915809
**** torch.quint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.587001323699951       7.011299133300781       2.7102031487456535
GB/s float      GB/s quant
1.2892022781148076      0.11892118481150399
**** torch.qint32 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.6659250259399414      7.03080415725708        2.637285028215745
GB/s float      GB/s quant
1.2510359321992184      0.4743650833393638

After the vectorization:
**** torch.qint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.6113319396972656      0.5631613731384277      0.2156605847679846
GB/s float      GB/s quant
1.2771903676047593      1.48055608884072
**** torch.quint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.5221967697143555      0.5518221855163574      0.21878633425529784
GB/s float      GB/s quant
1.322326647963202       1.5109794819499591
**** torch.qint32 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.5173258781433105      4.0132904052734375      1.5942673295177407
GB/s float      GB/s quant
1.324885279636461       0.8310308159154421

===============For adaptive_avg_pool2d==============
import torch, time

for dtype in [torch.qint8, torch.quint8, torch.qint32]:
    print('****', str(dtype), '*****')
    x = torch.rand(1, 56, 56, 256)

    q_x = torch.quantize_linear(x, 0.5, 1, dtype)
    q_x = q_x.permute([0, 3, 1, 2])

    x = x.permute([0, 3, 1, 2])

    NITER = 100

    s = time.time()
    for i in range(NITER):
        float_out = torch.nn.functional.adaptive_avg_pool2d(x, output_size=5)
    time_per_iter_float = (time.time() - s) / NITER

    s = time.time()
    for i in range(NITER):
        quant_out = torch.nn.quantized.functional.adaptive_avg_pool2d(q_x, output_size=5)
    time_per_iter_quant = (time.time() - s) / NITER

    ref_quantized = torch.quantize_linear(float_out, 0.5, 1, dtype)
    torch.testing.assert_allclose(ref_quantized.dequantize(), quant_out.dequantize())

    print('time/iter ms (float)', 'time/iter ms (quant)', 'quant/float', sep='\t')
    print(time_per_iter_float * 1000, time_per_iter_quant * 1000, time_per_iter_quant / time_per_iter_float, sep='\t')

    bytes_float = (x.numel() + float_out.numel()) * x.element_size()
    bytes_quant = (q_x.numel() + quant_out.numel()) * q_x.element_size()

    float_bw_gbps = bytes_float / time_per_iter_float / 1e9
    quant_bw_gbps = bytes_quant / time_per_iter_quant / 1e9

    print('GB/s float', 'GB/s quant', sep='\t')
    print(float_bw_gbps, quant_bw_gbps, sep='\t')
Before the vectorization:
**** torch.qint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.286238670349121       4.600362777709961       2.0121970804594342
GB/s float      GB/s quant
1.4158031888707898      0.17590264922602994
**** torch.quint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.2867274284362793      4.474163055419922       1.9565790831832832
GB/s float      GB/s quant
1.4155005794518536      0.180864217503144
**** torch.qint32 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.3176145553588867      4.264359474182129       1.8399778618588218
GB/s float      GB/s quant
1.3966360335956578      0.7590504551966285

After the vectorization:
**** torch.qint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.3224568367004395      0.23195743560791016     0.09987588657942796
GB/s float      GB/s quant
1.3937240722194333      3.4886400510473843
**** torch.quint8 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.255082130432129       0.2124309539794922      0.09420098324258604
GB/s float      GB/s quant
1.435364129899667       3.8093130254365883
**** torch.qint32 *****
time/iter ms (float)    time/iter ms (quant)    quant/float
2.266514301300049       1.6029787063598633      0.7072440290539581
GB/s float      GB/s quant
1.4281242338260862      2.0192807222938463
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25899

Differential Revision: D17437015

Pulled By: llyfacebook

fbshipit-source-id: 496aed1e41711048d0853254d6819d3fb141a0c0
2019-09-20 14:20:16 -07:00
Dmytro Dzhulgakov
af64789cfa Fold activation permutation inside quantized conv operator (#26242)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26242

According to https://github.com/pytorch/pytorch/issues/19092, we always keep the NCHW order and handle layout inside the kernels. This PR fixes it for the activations of qconv by using the MemoryLayout mechanism: activations stay logically NCHW but are strided as NHWC.

Note that this version is more aggressive than the eventual MemoryLayout mechanism: the QConv output is always NHWC regardless of the input striding. I think that is ok since we don't have NCHW quantized kernels anyway, so the very first conv would magically switch the order, but I'm open to suggestions. Also, it doesn't change behavior; the same happens today in master because of the explicit permute() call.
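
To make "logically NCHW but strided as NHWC" concrete, here is a minimal sketch using the channels_last memory format (not part of this PR; assumes a PyTorch build with torch.channels_last support, and the shape is an arbitrary illustration):

```
import torch

x = torch.rand(1, 3, 4, 4)                              # contiguous NCHW
nhwc = x.contiguous(memory_format=torch.channels_last)  # same logical NCHW shape, NHWC strides

print(nhwc.shape)     # torch.Size([1, 3, 4, 4]) -- logical layout unchanged
print(x.stride())     # (48, 16, 4, 1) -- NCHW strides
print(nhwc.stride())  # (48, 1, 12, 3) -- NHWC strides: channels innermost
```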

Test Plan: Imported from OSS

Differential Revision: D17443218

Pulled By: dzhulgakov

fbshipit-source-id: cfd136ae0465acd8d8c26ffad87385dac9c88726
2019-09-19 13:39:26 -07:00
Dmytro Dzhulgakov
d5daac7223 Fold weight permutation inside quantized conv operator (#26241)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26241

According to https://github.com/pytorch/pytorch/issues/19092, we always keep the NCHW order and handle layout inside the kernels. This PR fixes it for the weights of qconv by using the MemoryLayout mechanism.

Test Plan: Imported from OSS

Differential Revision: D17443219

Pulled By: dzhulgakov

fbshipit-source-id: ce0eb92034a9977b3303dafab8b0414575171062
2019-09-19 13:39:22 -07:00
Daya Khudia
2b52c1d982 Dynamic quantization for bias. (#26057)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26057

Bias is now unquantized (i.e., floating point) for qconv and qlinear. It is dynamically quantized by fbgemm.
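
A minimal sketch of what the new calling convention looks like for qlinear, assuming the generic quantized::linear_prepack op from the earlier rename commits; shapes, scales, and zero_points are arbitrary illustration values:

```
import torch

w = torch.rand(8, 16)
w_q = torch.quantize_per_tensor(w, scale=0.1, zero_point=0, dtype=torch.qint8)
b = torch.rand(8)  # bias stays a float tensor; fbgemm quantizes it dynamically

# The packed weight carries the float bias alongside the quantized weight.
packed = torch.ops.quantized.linear_prepack(w_q, b)
```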

TODO: Add some performance numbers.

Tests:

test:quantization
```
Summary (total time 8.41s):
  PASS: 24
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
More details at https://our.intern.facebook.com/intern/buck/build/74d5f6f7-55c9-4350-a618-2013042fffd8
```

test:quantized
```
Summary (total time 13.21s):
  PASS: 43
  FAIL: 0
  SKIP: 5
    caffe2/test:quantized - test_qnnpack_maxpool2d (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_compare_tensor_scalar (test_quantized.TestComparatorOps)
    caffe2/test:quantized - test_qnnpack_linear (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qnnpack_relu (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qnnpack_add (test_quantized.TestQNNPackOps)
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```
ghstack-source-id: 90166254

Test Plan:
buck test mode/dev caffe2/test:quantization

buck test mode/dev caffe2/test:quantized

Differential Revision: D17328028

fbshipit-source-id: d4a163d730d0f4a03e8e0faf7420710cf36eec09
2019-09-16 14:43:06 -07:00
Supriya Rao
c60dddbb9f Store bias in PackedConvWeight in fbgemm (#25626)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25626

Add bias as an optional parameter in the packed conv weight struct.
ghstack-source-id: 89780639

Test Plan: python test/run_test.py --exclude nn --verbose --bring-to-front quantization quantized quantized_tensor quantized_nn_mods quantizer

Reviewed By: raghuramank100

Differential Revision: D17177723

fbshipit-source-id: e502f2196cb1c002db8b691124db740368944c92
2019-09-10 08:43:55 -07:00
Supriya Rao
9d2d31e626 Store bias in PackedLinearWeight struct in fbgemm (#25428)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25428

Added bias as an optional param to the quantized_linear_prepack function.
Bias is quantized at runtime using the input scale and weight scale.
ghstack-source-id: 89601399
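
A minimal sketch of the run-time bias quantization convention described above (the standard scheme: bias scale = input scale * weight scale, zero_point 0, qint32; the concrete values are arbitrary and the fbgemm internals are not shown here):

```
import torch

input_scale, weight_scale = 0.05, 0.1  # arbitrary illustration values
b = torch.rand(8)                      # float bias passed to the prepack function

# Roughly what happens at run time inside the quantized linear op:
b_q = torch.quantize_per_tensor(
    b, scale=input_scale * weight_scale, zero_point=0, dtype=torch.qint32)
```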

Test Plan: python test/run_test.py --exclude nn --verbose --bring-to-front quantization quantized quantized_tensor quantized_nn_mods quantizer

Differential Revision: D17121304

fbshipit-source-id: 8adb0e55e4aed0a5430aaa2c8639c8ad1639c85a
2019-09-06 08:37:34 -07:00
Supriya Rao
61819260f7 Rename FBGEMM quantized operators to generic quantized ops (#25678)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25678

As an effort to unify fbgemm and qnnpack at the dispatcher level, we need a generic name for the quantized backend ops.
Currently FBGEMM is guarded by the USE_FBGEMM macro and QNNPACK uses USE_QNNPACK.
ghstack-source-id: 89518961

Test Plan: buck test caffe2/test:quantized

Differential Revision: D17194364

fbshipit-source-id: 5960aedff6b8cb89eb3872c39b74caf54c0fbf20
2019-09-05 10:13:08 -07:00
Edward Yang
55da02a86d Revert D17097735: [quantization] Rename fbgemm quantized operators to generic quantized ops
Test Plan: revert-hammer

Differential Revision:
D17097735

Original commit changeset: 447112a7a421

fbshipit-source-id: 78368b6f84d96cea70692fb000cebe99602a08c1
2019-09-04 15:02:32 -07:00
Supriya Rao
c9ba5186d3 Rename fbgemm quantized operators to generic quantized ops (#25338)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25338

As an effort to unify fbgemm and qnnpack at the dispatcher level, we need a generic name for the quantized backend ops.
Currently FBGEMM is guarded by the USE_FBGEMM macro and QNNPACK uses USE_QNNPACK.

TBD: use a compile-time macro or a run-time switch to choose between fbgemm and qnnpack.
ghstack-source-id: 89454244

Test Plan: buck test caffe2/test:quantized

Differential Revision: D17097735

fbshipit-source-id: 447112a7a421387724d3e29b8fd8412dfb1c373a
2019-09-04 14:27:27 -07:00
Raghuraman Krishnamoorthi
9945c0cea6 Work around for bias quantization for conv and linear operators (#25212)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25212

In eager mode, all modules need to work with input tensors whose qparams can change dynamically. Issue https://github.com/pytorch/pytorch/issues/23874 will address this via FBGEMM modifications; this is a workaround until then.
ghstack-source-id: 89118038

Test Plan:
buck test caffe2/test:quantized -- 'test_conv_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details
Summary (total time 65.86s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0

Differential Revision: D17064471

fbshipit-source-id: 3c192442b19bf2d9d88d4e52de6c24dc134a846f
2019-08-28 07:24:03 -07:00
Raghuraman Krishnamoorthi
26a438d4fb Revert D16852280: Work around for bias quantization for conv and linear operators
Test Plan: revert-hammer

Differential Revision:
D16852280

Original commit changeset: 988f8ff91616

fbshipit-source-id: e2cf03e13dc8dcf0db22d43740d72fd8b069fd74
2019-08-26 16:25:33 -07:00
Raghuraman Krishnamoorthi
ea601d90d6 Work around for bias quantization for conv and linear operators (#24789)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24789

In eager mode, all modules need to work with input tensors whose qparams can change dynamically. Issue https://github.com/pytorch/pytorch/issues/23874 will address this via FBGEMM modifications; this is a workaround until then.
ghstack-source-id: 89003798

Test Plan:
buck test caffe2/test:quantized -- 'test_conv_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details
Summary (total time 65.86s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0

Differential Revision: D16852280

fbshipit-source-id: 988f8ff91616eddf511e71926aa7d2d0f1938188
2019-08-26 12:16:42 -07:00
Zafar Takhirov
dd97743de7 Enables inplace in the quantized relu (#24374)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24374

This is a duplicate PR to bring back #23704, with diff revision D16634539.

Test Plan: Imported from OSS

Differential Revision: D16818664

Pulled By: zafartahirov

fbshipit-source-id: c8f7965356555a6a995eaeea6820ea62cbbea6fd
2019-08-16 16:53:09 -07:00
Edward Yang
ce79d5135a Revert D16634539: Enabling inline in quantized relu
Differential Revision:
D16634539

Original commit changeset: 84266f92049c

fbshipit-source-id: 5e1d8e3560483600a61c2ac62b13e9c3fede8301
2019-08-09 08:33:39 -07:00
Zafar Takhirov
9558ccdd76 Enabling inline in quantized relu
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23704

Test Plan: Imported from OSS

Differential Revision: D16634539

Pulled By: zafartahirov

fbshipit-source-id: 84266f92049ce4410ec25821b8d4699a9e3f123e
2019-08-09 02:37:12 -07:00
James Reed
3314d60a75 fix conv2d
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23690

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D16610734

Pulled By: jamesr66a

fbshipit-source-id: e190174f11d1810e6f87e2df256543028e9154ef
2019-08-01 19:39:08 -07:00
Zafar Takhirov
5e4c24baef Documentation cleanup
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23148

Test Plan: Imported from OSS

Differential Revision: D16414202

Pulled By: zafartahirov

fbshipit-source-id: a999be0384a2ff5272dd2f8adcf87547ce6ee9dd
2019-07-31 11:30:44 -07:00
Zafar Takhirov
5b4ac841c9 Quantized Average Pool kernel
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23143

Test Plan: Imported from OSS

Differential Revision: D16406281

Pulled By: zafartahirov

fbshipit-source-id: dcd8b58a0ef32b3dcc3337c282c59b4e52091516
2019-07-30 10:51:25 -07:00
Daya Khudia
6a8c2758d5 Add better performing versions for groupwise and depthwise convolutions (#22869)
Summary:
Groupwise and depthwise convolutions become faster with this diff
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22869

Test Plan:
buck test mode/dev caffe2/test:quantized -- 'test_qconv'  --print-passing-details

```
Running 2 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/562950091484224
      ✓ caffe2/test:quantized - test_qconv (test_quantized.TestQuantizedConv) 2.731 1/2 (passed)
Test output:
> test_qconv (test_quantized.TestQuantizedConv) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 2.732s
>
> OK
      ✓ caffe2/test:quantized - test_qconv_unpack (test_quantized.TestQuantizedConv) 5.187 2/2 (passed)
Test output:
> test_qconv_unpack (test_quantized.TestQuantizedConv) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 5.188s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/562950091484224
Summary (total time 15.66s):
  PASS: 2
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0

```

buck test mode/dev caffe2/test:quantized -- 'test_conv_api'
```
Running 2 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649676010406
      ✓ caffe2/test:quantized - test_conv_api (test_nn_quantized.ModuleAPITest) 0.040 1/2 (passed)
      ✓ caffe2/test:quantized - test_conv_api (test_quantized_conv.FunctionalAPITest) 5.402 2/2 (passed)
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649676010406
Summary (total time 11.83s):
  PASS: 2
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Differential Revision: D16264144

Pulled By: dskhudia

fbshipit-source-id: 32fa43e5c3d97c8aaa6e0858327a2ac0aef8df5c
2019-07-25 17:55:43 -07:00
Zafar Takhirov
94711d7471 Quantized conv avoid functional usage (#22733)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22733

This refactor changes the conv module to avoid using the functional ops.

Reviewed By: jerryzh168

Differential Revision: D15835572

fbshipit-source-id: f2294cd708fbe8372eb3a15cc60d83777d4f7029
2019-07-24 11:43:12 -07:00
Jerry Zhang
d7448c7812 quantized conv module (#23178)
Summary:
As titled.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23178
ghstack-source-id: 86973164

Differential Revision: D16426871

fbshipit-source-id: a2ebb38997acfeb61b7dfd6b11dd8ee9b3a7a8ed
2019-07-22 20:47:40 -07:00
Zafar Takhirov
963707c5ea MaxPool2d in the torch (#22765)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22765

The pooling signature is the same as the non-quantized one; this adds it to native_functions.yaml.
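
A minimal sketch of what "same signature as the non-quantized one" means in practice (shape, scale, and zero_point are arbitrary illustration values):

```
import torch
import torch.nn.functional as F

x = torch.rand(1, 3, 8, 8)
q_x = torch.quantize_per_tensor(x, scale=0.5, zero_point=1, dtype=torch.quint8)

# Same call as for a float tensor; the output stays quantized.
q_y = F.max_pool2d(q_x, kernel_size=2, stride=2, padding=0)
```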

Reviewed By: jerryzh168

Differential Revision: D16102608

fbshipit-source-id: 7627ad8f02a231f488b74d1a245b853f89d9c419
2019-07-20 21:41:09 -07:00
Zafar Takhirov
d21e476dcd Quantized Conv2d Module (#21323)
Summary:
Stack:
      https://github.com/pytorch/pytorch/issues/21808 Quantized conv avoid functional usage  [💛](https://our.intern.facebook.com/intern/diff/D15835572/)
      **https://github.com/pytorch/pytorch/issues/21323 Quantized Conv2d Module**  [💛](https://our.intern.facebook.com/intern/diff/D15551835/)

Quantized Conv2d Module
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21323

Test Plan:
Tests are split into two parts: functional and API.

`buck test mode/dev caffe2/test:quantized -- test_conv_api` : https://our.intern.facebook.com/intern/testinfra/testrun/4785074605318491

```
Parsing buck files: finished in 1.4 sec
Building: finished in 4.6 sec (100%) 7136/7136 jobs, 2 updated
  Total time: 6.1 sec
Trace available for this run at /tmp/testpilot.20190703-153023.392592.log
TestPilot test runner for Facebook. See https://fburl.com/testpilot for details.
Testpilot build revision 7149de230b9e1cdc7a872bb31fe099f0616dee09 fbpkg e59e6ab0fe8e47a496f915d34555c3ad at Fri Jun 28 12:20:54 2019 by twsvcscm from /usr/local/fbprojects/packages/testinfra.testpilot/647/t.par
Discovering tests
Running 2 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/4785074605318491
      ✓ caffe2/test:quantized - test_conv_api (test_nn_quantized.ModuleAPITest) 0.044 1/2 (passed)
      ✓ caffe2/test:quantized - test_conv_api (test_quantized_conv.FunctionalAPITest) 5.109 2/2 (passed)
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/4785074605318491
Summary (total time 9.08s):
  PASS: 2
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Differential Revision: D15551835

Pulled By: zafartahirov

fbshipit-source-id: 481a7df4b8a88e485437e1596eefb08d5e6766fa
2019-07-10 21:31:24 -07:00
zaf
e9d1b852c4 Functional conv2d (#21225)
Summary:
Stack:
      https://github.com/pytorch/pytorch/issues/21323 Quantized Conv2d Module  [💛](https://our.intern.facebook.com/intern/diff/D15551835/)
      **https://github.com/pytorch/pytorch/issues/21225 Functional conv2d**  [💛](https://our.intern.facebook.com/intern/diff/D15544061/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/21225

Test Plan:
`buck test mode/dev caffe2/test:quantized -- test_conv_api`: https://our.intern.facebook.com/intern/testinfra/testrun/1407375016833929

```
Action graph will be rebuilt because files have been added or removed.
Parsing buck files: finished in 1.1 sec
Building: finished in 5.1 sec (100%) 6958/6958 jobs, 2 updated
  Total time: 6.3 sec
Trace available for this run at /tmp/testpilot.20190603-163323.4026295.log
TestPilot test runner for Facebook. See https://fburl.com/testpilot for details.
Testpilot build revision 17661db57af88ec71497f5c21efa86531c07662b fbpkg ce57c6c1c73f45c4aa890e9df65820c3 at Sat Jun  1 17:06:32 2019 by twsvcscm from /usr/local/fbprojects/packages/testinfra.testpilot/625/t.par
Discovering tests
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/1407375016833929
      ✓ caffe2/test:quantized - test_conv_api (test_quantized_conv.FunctionalAPITest) 6.962 1/1 (passed)
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/1407375016833929
Summary (total time 10.65s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Reviewed By: dskhudia

Differential Revision: D15544061

Pulled By: zafartahirov

fbshipit-source-id: 700c0c78b5915bf7e54bda7c44f44b7b1e247f4d
2019-06-27 09:19:54 -07:00
Jerry Zhang
74375299e0 add torch.nn._intrinsic and torch.nn._intrinsic.quantized namespace (#20940)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20940

- `torch.nn._intrinsic` will contain normal (unquantized) fused modules such as Conv2DRelu, Conv2DBnRelu, FakeQuantize ops, etc.
- `torch.nn._intrinsic.quantized` will contain fused and quantized modules such as Quantized Conv2DRelu, Quantized LinearRelu, etc.

Right now only the FakeQuantize op is added to the `torch.nn._intrinsic` namespace; more will follow later.

Differential Revision: D15505228

fbshipit-source-id: d380929e38af7a5bcfbea27474d5b80f95d43b03
2019-05-29 14:09:37 -07:00
Jerry Zhang
abb3698976 Add QInt32 ScalarType and qint32 data type (#19816)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19816

We need this for quantizing the bias; this adds a third ScalarType argument to `quantize_linear`.

Differential Revision: D15094174

fbshipit-source-id: f19ec8f4716cf5fe0aa21b38d45af6d27c9ab377
2019-05-15 18:50:18 -07:00
Jerry Zhang
8ca10d35e5 Add torch.nn.quantized.functional namespace (#20042)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20042

Exposing torch.ops.quantized as torch.nn.quantized.functional
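
As a rough usage sketch (using ops that later commits in this log expose through the namespace; shapes, scale, and zero_point are arbitrary illustration values):

```
import torch
import torch.nn.quantized.functional as qF

x = torch.rand(1, 256, 56, 56)
q_x = torch.quantize_per_tensor(x, scale=0.5, zero_point=1, dtype=torch.quint8)

q_y = qF.avg_pool2d(q_x, kernel_size=3, stride=None, padding=0)
```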

Differential Revision: D15178099

fbshipit-source-id: 8d65134bd727296f2750bbd2b54df0b99fc84b33
2019-05-06 18:49:58 -07:00