Commit Graph

63 Commits

James Reed
79710604cc fix lint
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24375

Test Plan: Imported from OSS

Differential Revision: D16819647

Pulled By: jamesr66a

fbshipit-source-id: 84eefe1ee27bd05ed9b8745d8011dddf6cb3ddbf
2019-08-14 17:37:39 -07:00
Jianyu Huang
53fbfd8fe8 Fix the dimension mismatch issues when running the BERT model (#23330)
Summary:
We found the following dimension mismatch issues when running the BERT model with dynamic quantization:
```
Traceback (most recent call last):
  File "bert.py", line 75, in <module>
    outputs = model(tokens_tensor, token_type_ids=segments_tensors)
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/pytorch_transformers/modeling_bert.py", line 709, in forward
    head_mask=head_mask)
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/pytorch_transformers/modeling_bert.py", line 437, in forward
    layer_outputs = layer_module(hidden_states, attention_mask, head_mask[i])
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/pytorch_transformers/modeling_bert.py", line 415, in forward
    attention_outputs = self.attention(hidden_states, attention_mask, head_mask)
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/pytorch_transformers/modeling_bert.py", line 372, in forward
    self_outputs = self.self(input_tensor, attention_mask, head_mask)
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/pytorch_transformers/modeling_bert.py", line 303, in forward
    query_layer = self.transpose_for_scores(mixed_query_layer)
  File "/home/jianyuhuang/anaconda3/lib/python3.7/site-packages/pytorch_transformers/modeling_bert.py", line 296, in transpose_for_scores
    return x.permute(0, 2, 1, 3)
RuntimeError: number of dims don't match in permute
```

Before the quantization, the dimension of `x` in `transpose_for_scores` is `[1, 14, 12, 64]`;
after the quantization, the dimension of `x` in `transpose_for_scores` is `[14, 12, 64]`.

There is a dimension mismatch in the output of the `torch.ops.quantized.fbgemm_linear_dynamic` operator: the first dimension is missing, which causes the issue with the above permute.
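
A minimal repro of the failure mode, using the shapes reported above: `permute(0, 2, 1, 3)` needs a 4-D input, so the 3-D tensor produced after quantization raises the error from the traceback.
```python
import torch

x4 = torch.randn(1, 14, 12, 64)  # shape before quantization
x3 = torch.randn(14, 12, 64)     # shape after quantization: first dim missing

x4.permute(0, 2, 1, 3)  # OK
x3.permute(0, 2, 1, 3)  # RuntimeError: number of dims don't match in permute
```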

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23330
ghstack-source-id: 88287092

Differential Revision: D16463334

fbshipit-source-id: 4bdb836d1df31ba7c0bd44e3339aabdc8b943ae1
2019-08-14 14:20:50 -07:00
James Reed
45962ac5b6 equal() for QuantizedCPU
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24211

Test Plan: Imported from OSS

Differential Revision: D16799801

Pulled By: jamesr66a

fbshipit-source-id: d3c17a7b5305f94217aef2740124506f34fe2458
2019-08-14 13:51:18 -07:00
Jianyu Huang
ec1e53b462 Add dynamic quantized Linear op in PyTorch (#23464)
Summary:
As suggested in https://github.com/pytorch/pytorch/pull/22891, we will add an overload for `torch.fbgemm_linear_int8_weight` (the dynamic quantized version of the linear function) that takes PackedLinearWeight as input and matches the signature of the regular aten::linear.
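
A hedged sketch of the intended call pattern. The packing helper name and the exact operator signature below are illustrative assumptions, not taken from this commit; only `fbgemm_linear_dynamic` is named elsewhere in this log. The idea: pack the weight once, then call a dynamic linear op whose usage mirrors aten::linear.
```python
# Sketch only: op names/signatures here are hypothetical stand-ins.
import torch

x = torch.randn(4, 8)    # fp32 activations, quantized on the fly at run time
w = torch.randn(16, 8)   # fp32 weight, quantized and packed ahead of time
b = torch.zeros(16)

packed_w = torch.ops.quantized.fbgemm_linear_prepack(w)        # hypothetical packing step
y = torch.ops.quantized.fbgemm_linear_dynamic(x, packed_w, b)  # usage mirrors aten::linear(x, w, b)
```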

The previous Diff D16381552 was reverted because `quantize_linear` expects the scale to be `float` and the zero_point to be `int`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23464
ghstack-source-id: 88257231

Differential Revision: D16527741

fbshipit-source-id: 66585f668c6e623c50514eb11633bb711d8767f2
2019-08-13 19:59:35 -07:00
Zafar Takhirov
45ca36faaf Add out variant
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23971

Test Plan: Imported from OSS

Differential Revision: D16695592

Pulled By: zafartahirov

fbshipit-source-id: 210dfceae90ac75c53f56bbb96170bdd8e6b8ff3
2019-08-13 17:36:24 -07:00
Daya Khudia
f510409281 Enable FBGEMM tests under UBSAN as well (#23570)
Summary:
Enabling tests under UBSAN as well
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23570

Test Plan:
buck test mode/dev caffe2/test:quantized
```
Running 29 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649677415136
      ✓ caffe2/test:quantized - test_qtensor (test_quantized_tensor.TestQuantizedTensor) 0.536 1/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_per_channel_affine (test_quantized_tensor.TestQuantizedTensor) 0.453 2/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_reshape (test_quantized_tensor.TestQuantizedTensor) 0.302 3/29 (passed)
      ✓ caffe2/test:quantized - test_qadd_relu_same_qparams (test_quantized.TestQuantizedOps) 0.332 4/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_view (test_quantized_tensor.TestQuantizedTensor) 0.351 5/29 (passed)
      ✓ caffe2/test:quantized - test_qadd_relu_different_qparams (test_quantized.TestQuantizedOps) 0.348 6/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_dequantize_linear (test_quantized_tensor.TestQuantizedTensor) 0.338 7/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_copy (test_quantized_tensor.TestQuantizedTensor) 0.267 8/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_clone (test_quantized_tensor.TestQuantizedTensor) 0.330 9/29 (passed)
      ✓ caffe2/test:quantized - test_qrelu (test_quantized.TestQuantizedOps) 1.774 10/29 (passed)
      ✓ caffe2/test:quantized - test_pool_api (test_nn_quantized.ModuleAPITest) 0.418 11/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_load_save (test_quantized_tensor.TestQuantizedTensor) 0.724 12/29 (passed)
      ✓ caffe2/test:quantized - test_relu_api (test_nn_quantized.FunctionalAPITest) 1.013 13/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_quant_dequant (test_quantized_tensor.TestQuantizedTensor) 1.055 14/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_permute (test_quantized_tensor.TestQuantizedTensor) 0.696 15/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_dtypes (test_quantized_tensor.TestQuantizedTensor) 0.841 16/29 (passed)
      ✓ caffe2/test:quantized - test_quant_dequant_api (test_nn_quantized.ModuleAPITest) 0.616 17/29 (passed)
      ✓ caffe2/test:quantized - test_qtensor_creation (test_quantized_tensor.TestQuantizedTensor) 0.698 18/29 (passed)
      ✓ caffe2/test:quantized - test_qconv (test_quantized.TestQuantizedConv) 4.743 19/29 (passed)
      ✓ caffe2/test:quantized - test_cat (test_quantized.TestQuantizedOps) 6.992 20/29 (passed)
      ✓ caffe2/test:quantized - test_linear_api (test_nn_quantized.ModuleAPITest) 8.970 21/29 (passed)
      ✓ caffe2/test:quantized - test_conv_api (test_quantized_conv.QuantizedConvTest) 9.403 22/29 (passed)
      ↷ caffe2/test:quantized - test_qnnpack_linear (test_quantized.TestQNNPackOps) 0.000 23/29 (skipped)
Test output:
> Skipped: QNNPACK does not play well with UBSAN at the moment, so we skip the test if we are in a UBSAN environment.
> test_qnnpack_linear (test_quantized.TestQNNPackOps) ... skipped 'QNNPACK does not play well with UBSAN at the moment, so we skip the test if we are in a UBSAN environment.'
>
> ----------------------------------------------------------------------
> Ran 1 test in 0.000s
>
> OK (skipped=1)
      ↷ caffe2/test:quantized - test_qnnpack_relu (test_quantized.TestQNNPackOps) 0.000 24/29 (skipped)
Test output:
> Skipped: QNNPACK does not play well with UBSAN at the moment, so we skip the test if we are in a UBSAN environment.
> test_qnnpack_relu (test_quantized.TestQNNPackOps) ... skipped 'QNNPACK does not play well with UBSAN at the moment, so we skip the test if we are in a UBSAN environment.'
>
> ----------------------------------------------------------------------
> Ran 1 test in 0.000s
>
> OK (skipped=1)
      ✓ caffe2/test:quantized - test_max_pool2d (test_quantized.TestQuantizedOps) 8.453 25/29 (passed)
      ✓ caffe2/test:quantized - test_qlinear_unpack (test_quantized.TestQuantizedLinear) 0.664 26/29 (passed)
      ✓ caffe2/test:quantized - test_qconv_unpack (test_quantized.TestQuantizedConv) 2.965 27/29 (passed)
      ✓ caffe2/test:quantized - test_qlinear (test_quantized.TestQuantizedLinear) 1.915 28/29 (passed)
      ✓ caffe2/test:quantized - test_conv_api (test_nn_quantized.ModuleAPITest) 60.804 29/29 (passed)
      ✓ caffe2/test:quantized - main 0.000 (passed)
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649677415136
Summary (total time 68.66s):
  PASS: 28
  FAIL: 0
  SKIP: 2
    caffe2/test:quantized - test_qnnpack_linear (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qnnpack_relu (test_quantized.TestQNNPackOps)
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Reviewed By: jianyuh

Differential Revision: D16569166

Pulled By: dskhudia

fbshipit-source-id: 53522b4162eb1ebb35b408a1503d9664305c85b0
2019-08-12 17:59:22 -07:00
Edward Yang
ce79d5135a Revert D16634539: Enabling inline in quantized relu
Differential Revision:
D16634539

Original commit changeset: 84266f92049c

fbshipit-source-id: 5e1d8e3560483600a61c2ac62b13e9c3fede8301
2019-08-09 08:33:39 -07:00
Zafar Takhirov
9558ccdd76 Enabling inline in quantized relu
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23704

Test Plan: Imported from OSS

Differential Revision: D16634539

Pulled By: zafartahirov

fbshipit-source-id: 84266f92049ce4410ec25821b8d4699a9e3f123e
2019-08-09 02:37:12 -07:00
Daya Khudia
31137738de Support for non-zero zero_points for weight and activation (#23541)
Summary:
We can now use any valid zero points for weight and activation in the conv2d kernel.
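
A hedged illustration of what is now accepted, using the `torch.quantize_linear` API quoted elsewhere in this log; the specific scales and zero_points are arbitrary.
```python
import torch

x = torch.randn(1, 3, 8, 8)   # NCHW activation
w = torch.randn(4, 3, 3, 3)   # OIHW weight

# Non-zero zero_points for both activation and weight are now valid
# inputs to the quantized conv2d kernel.
qx = torch.quantize_linear(x, 0.05, 5, torch.quint8)  # activation zero_point = 5
qw = torch.quantize_linear(w, 0.02, 2, torch.qint8)   # weight zero_point = 2
```
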
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23541

Test Plan:
buck test mode/dev caffe2/test:quantized -- 'test_qconv\ \(test_quantized.TestQuantizedConv\)'  --print-passing-details
```
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/3377699723897843
      ✓ caffe2/test:quantized - test_qconv (test_quantized.TestQuantizedConv) 68.528 1/1 (passed)
Test output:
> test_qconv (test_quantized.TestQuantizedConv) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 68.529s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/3377699723897843
Summary (total time 74.97s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Differential Revision: D16556515

Pulled By: dskhudia

fbshipit-source-id: 6e2ee9ddc58f9dc8a3f8b25918bb7955f0655073
2019-08-04 11:05:25 -07:00
Zafar Takhirov
5b4ac841c9 Quantized Average Pool kernel
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23143

Test Plan: Imported from OSS

Differential Revision: D16406281

Pulled By: zafartahirov

fbshipit-source-id: dcd8b58a0ef32b3dcc3337c282c59b4e52091516
2019-07-30 10:51:25 -07:00
Edward Yang
9dad13e1f0 Revert "Add fbgemm_qlinear_dynamic op (#23104)"
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23449

Test Plan: Imported from OSS

Differential Revision: D16524768

Pulled By: ezyang

fbshipit-source-id: 9eb01b021011d1172317b5adb774c10c42ac2b86
2019-07-26 15:02:33 -07:00
Jianyu Huang
47a54295ee Add fbgemm_qlinear_dynamic op (#23104)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23104

ghstack-source-id: 87247148

As suggested in https://github.com/pytorch/pytorch/pull/22891, we will add an overload for `torch.fbgemm_linear_int8_weight` (the dynamic quantized version of the linear function) that takes PackedLinearWeight as input and matches the signature of the regular aten::linear.

Differential Revision: D16381552

fbshipit-source-id: 1ccc4174fd02c546eee328940ac4b0da48fc85e8
2019-07-26 10:11:56 -07:00
Daya Khudia
bd54608bd2 fused qconv2d+relu kernel (#23353)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23353

Adding support for fused qconv2d + relu

Reviewed By: jianyuh

Differential Revision: D16473318

fbshipit-source-id: cd3c3476a21ffe946dbd9812e833b957c0fd206c
2019-07-25 17:55:47 -07:00
Daya Khudia
6a8c2758d5 Add better performing versions for groupwise and depthwise convolutions (#22869)
Summary:
Groupwise and depthwise convolutions become faster with this diff
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22869

Test Plan:
buck test mode/dev caffe2/test:quantized -- 'test_qconv'  --print-passing-details

```
Running 2 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/562950091484224
      ✓ caffe2/test:quantized - test_qconv (test_quantized.TestQuantizedConv) 2.731 1/2 (passed)
Test output:
> test_qconv (test_quantized.TestQuantizedConv) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 2.732s
>
> OK
      ✓ caffe2/test:quantized - test_qconv_unpack (test_quantized.TestQuantizedConv) 5.187 2/2 (passed)
Test output:
> test_qconv_unpack (test_quantized.TestQuantizedConv) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 5.188s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/562950091484224
Summary (total time 15.66s):
  PASS: 2
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0

```

buck test mode/dev caffe2/test:quantized -- 'test_conv_api'
```
Running 2 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649676010406
      ✓ caffe2/test:quantized - test_conv_api (test_nn_quantized.ModuleAPITest) 0.040 1/2 (passed)
      ✓ caffe2/test:quantized - test_conv_api (test_quantized_conv.FunctionalAPITest) 5.402 2/2 (passed)
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649676010406
Summary (total time 11.83s):
  PASS: 2
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Differential Revision: D16264144

Pulled By: dskhudia

fbshipit-source-id: 32fa43e5c3d97c8aaa6e0858327a2ac0aef8df5c
2019-07-25 17:55:43 -07:00
David Clissold
c23ba35009 Skip QNNpack tests on ppc64le (where support is not enabled) (#23343)
Summary:
Proposed PR for
https://github.com/pytorch/pytorch/issues/23342

Disables execution of QNNpack tests if IS_PPC.
This parallels the skipping of the same tests for IS_WINDOWS, which is already present.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23343

Differential Revision: D16469218

Pulled By: soumith

fbshipit-source-id: 80b651d00e5d413e359cf418f79e20d74cd9c8e1
2019-07-24 15:24:00 -07:00
Zafar Takhirov
963707c5ea MaxPool2d in the torch (#22765)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22765

The pooling signature is the same as the non-quantized one; adding it to native_functions.yaml.
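
Since the entry shares the non-quantized signature, the torch-level call should work unchanged on a quantized tensor. A hedged sketch, using the `torch.quantize_linear` API quoted elsewhere in this log:
```python
import torch

x = torch.randn(1, 3, 8, 8)
qx = torch.quantize_linear(x, 0.1, 0, torch.quint8)
qy = torch.max_pool2d(qx, kernel_size=2, stride=2)  # same signature as the fp32 op
```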

Reviewed By: jerryzh168

Differential Revision: D16102608

fbshipit-source-id: 7627ad8f02a231f488b74d1a245b853f89d9c419
2019-07-20 21:41:09 -07:00
Zafar Takhirov
cf3e6478ad Concat with out (#22408)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22408

Quantized Concatenation with out argument

Reviewed By: jianyuh

Differential Revision: D16061526

fbshipit-source-id: 61487cf87763665df19feb8e678da72fd66e8740
2019-07-20 16:13:14 -07:00
Zafar Takhirov
47af41fe72 Quantized concatenation (+fused relu). (#21749)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21749

This is the first version without "requantization"

Reviewed By: jerryzh168

Differential Revision: D15807940

fbshipit-source-id: 19bb0482abed8ed9d1521a3fa1f15bda8e6a6a7c
2019-07-19 22:23:41 -07:00
Zafar Takhirov
992f3860a3 Quantized relu to native_functions (#22316)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22316

Adding the quantized ReLU to native_functions.yaml, as it has the same signature as the non-quantized relu

Reviewed By: jerryzh168

Differential Revision: D16038441

fbshipit-source-id: 1cfbb594eb9bca1b7ec49ca486defcf1908b0d26
2019-07-17 17:31:02 -07:00
Jan Schlüter
5adba33c01 Use integer floor division for pooling shape computation (#22304)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/21935 by using the integer floor division that was introduced for convolution shapes in https://github.com/pytorch/pytorch/issues/9640. Without this fix, the pooling operators can produce a 1-element output in cases where they shouldn't.
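
For reference, a hedged sketch of the shape arithmetic in question (the standard pooling formula; the actual patch may differ in details). The point is to stay in integer arithmetic throughout, since floating-point division can introduce rounding error before truncation and produce an off-by-one output size.
```python
def pooling_output_size(in_size, kernel, pad, stride, dilation=1, ceil_mode=False):
    numer = in_size + 2 * pad - dilation * (kernel - 1) - 1
    if ceil_mode:
        return -(-numer // stride) + 1  # integer ceiling division
    return numer // stride + 1          # integer floor division
```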

Disclaimer: I couldn't properly test it locally (it's not picking up the modified version for some reason). I'm marking this WIP until I've checked what the CI tools say...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22304

Differential Revision: D16181955

Pulled By: ezyang

fbshipit-source-id: a2405372753572548b40616d1206848b527c8121
2019-07-17 13:23:29 -07:00
Zafar Takhirov
35b6cdc2eb Rewriting hypothesis_utils (#22830)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22830

Separating tensor generation from the generation of the quantization parameters:

- Introducing the hypothesis filter `assume_not_overflowing`, which makes sure that the generated tensor and qparams play well with each other. **Note: this is an expensive filter!**
- `qtensor` -> renamed to `tensor`
- `qtensor_conv` -> renamed to `tensor_conv2d`
- The tensors don't return the quantization parameters anymore; use `qparams` for that
- The `dtypes` argument is just a quantized dtype now
- As before, the enforcement for zero_point is predefined; if set to `None`, the zero_point will be sampled. Sampling can be overridden with `zero_point_min` and `zero_point_max`
- Scale sampling can likewise be overridden using `scale_min` and `scale_max` (a usage sketch follows below)
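
A hedged sketch of how the reworked strategies might compose in a test. The import path, the `shapes` keyword, and the tuple returned by `qparams` are assumptions based on the notes above, not lifted from the diff.
```python
import torch
from hypothesis import given
import hypothesis_utils as hu  # assumed import path for the rewritten module

@given(X=hu.tensor(shapes=((4, 8),)),            # data only; qparams no longer bundled
       qparams=hu.qparams(dtypes=torch.quint8))  # scale/zero_point drawn separately
def test_quantize_roundtrip(X, qparams):
    scale, zero_point, dtype = qparams
    qX = torch.quantize_linear(torch.as_tensor(X), scale, zero_point, dtype)
    assert qX.dequantize().shape == torch.as_tensor(X).shape
```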

Reviewed By: jerryzh168

Differential Revision: D16234314

fbshipit-source-id: 5b538a5aa9772b7add4f2ce5eff6fd0decd48f8e
2019-07-17 10:16:13 -07:00
Jianyu Huang
8ec712da30 Add support for Bias being nullptr in torch.ops.quantized.fbgemm_linear (#22403)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22403

- C10 Operator Registration (https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/op_registration/op_registration.cpp) supports None type.

- ATen has None Tensor support, e.g., https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/native_functions.yaml#L1078

Reviewed By: zafartahirov

Differential Revision: D16069522

fbshipit-source-id: 3acaec783fc138ff36b14ffc0582d0764be4ad34
2019-07-11 17:33:08 -07:00
Jerry Zhang
1682d38a25 Improve hypothesis_utils.py for qtensor (#22693)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22693

change np.finfo to torch.finfo

Differential Revision: D16185556

fbshipit-source-id: 594f8ba1d6317ac2de47af754a8bd6015d40ea15
2019-07-11 11:56:01 -07:00
Lucas Kabela
3e3e6ee335 Add common_quantized test case utilities (#22694)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22694

Move quantization and quantized utility functions for testing to common_quantized.py and common_quantization.py. Additionally, add a quantized test case base class which contains common methods for checking the results of quantization on modules. As a consequence of the move, fixed the imports at the top of test_quantized.py and test_quantization to use the new utilities.

Reviewed By: jerryzh168

Differential Revision: D16172012

fbshipit-source-id: 329166af5555fc829f26bf1383d682c25c01a7d9
2019-07-10 12:23:36 -07:00
Supriya Rao
c97829d701 Adding FC and Relu QNNPACK ops to C10 registry (#22174)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22174

This is a preliminary change outlining the approach we plan to follow to integrate QNNPACK operators into the PyTorch backend. The operators will not be made visible to the user in the Python world, so ultimately we will have a function that calls the QNNPACK backend based on the environment it runs on.

The goal of the project is to integrate QNNPACK library with PyTorch to achieve good performance for quantized mobile models.

Reviewed By: ljk53

Differential Revision: D15806325

fbshipit-source-id: c14e1d864ac94570333a7b14031ea231d095c2ae
2019-07-08 14:21:42 -07:00
Jianyu Huang
4ba1c4f798 Add support for Bias being nullptr in torch.ops.quantized.fbgemm_conv (#22472)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22472

As Title says.

Reviewed By: dskhudia, bddppq

Differential Revision: D16097594

fbshipit-source-id: 7f56b7906dd9c2792e21a8aa553c0b8d05b19012
2019-07-04 19:37:37 -07:00
Dmytro Dzhulgakov
6721e67c10 Remove hacky stub for quantized ops (#22388)
Summary:
Effectively reverts https://github.com/pytorch/pytorch/pull/18267 - this was a temporary measure and is not used any more.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22388

Differential Revision: D16070725

Pulled By: dzhulgakov

fbshipit-source-id: ee5db11a608f248b0da981169d4cc90470fd482f
2019-07-01 23:21:42 -07:00
Daya Khudia
451c907a47 Adding qconv unpack operator for serialization (#22354)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22354

qconv weight unpack operator

Reviewed By: zafartahirov, jianyuh

Differential Revision: D16059668

fbshipit-source-id: b068b1a13bcf6a9148d864db384db780d474bfbf
2019-07-01 09:39:14 -07:00
zaf
e9d1b852c4 Functional conv2d (#21225)
Summary:
Stack:
- https://github.com/pytorch/pytorch/issues/21323 Quantized Conv2d Module [💛](https://our.intern.facebook.com/intern/diff/D15551835/)
- **https://github.com/pytorch/pytorch/issues/21225 Functional conv2d** [💛](https://our.intern.facebook.com/intern/diff/D15544061/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/21225

Test Plan:
`buck test mode/dev caffe2/test:quantized -- test_conv_api`: https://our.intern.facebook.com/intern/testinfra/testrun/1407375016833929

```
Action graph will be rebuilt because files have been added or removed.
Parsing buck files: finished in 1.1 sec
Building: finished in 5.1 sec (100%) 6958/6958 jobs, 2 updated
  Total time: 6.3 sec
Trace available for this run at /tmp/testpilot.20190603-163323.4026295.log
TestPilot test runner for Facebook. See https://fburl.com/testpilot for details.
Testpilot build revision 17661db57af88ec71497f5c21efa86531c07662b fbpkg ce57c6c1c73f45c4aa890e9df65820c3 at Sat Jun  1 17:06:32 2019 by twsvcscm from /usr/local/fbprojects/packages/testinfra.testpilot/625/t.par
Discovering tests
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/1407375016833929
      ✓ caffe2/test:quantized - test_conv_api (test_quantized_conv.FunctionalAPITest) 6.962 1/1 (passed)
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/1407375016833929
Summary (total time 10.65s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Reviewed By: dskhudia

Differential Revision: D15544061

Pulled By: zafartahirov

fbshipit-source-id: 700c0c78b5915bf7e54bda7c44f44b7b1e247f4d
2019-06-27 09:19:54 -07:00
Jerry Zhang
88921feafd change return type for q_scale and q_zero_point (#21709)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21709

Change the return type from Scalar to double/int64_t so we don't need to do a conversion when calling other quantize-related aten functions

Differential Revision: D15793003

fbshipit-source-id: 510936c69fa17a4d67340a31ebb03415647feb04
2019-06-20 20:30:39 -07:00
Daya Khudia
7123c6ca04 Enable groupwise for qconv (#21592)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21592

We now support groupwise convolutions for qconv2d

Reviewed By: zafartahirov

Differential Revision: D15739239

fbshipit-source-id: 80b9b4fef5b9ee3d22ebecbaf205b970ab3d4250
2019-06-12 11:03:36 -07:00
Daya Khudia
ee33afe2b1 randomized testing for qconv (#21436)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21436

Test many different options

Reviewed By: zafartahirov

Differential Revision: D15683754

fbshipit-source-id: 60d0fc697b53c7e4adadbe80995d45f28729bca4
2019-06-11 16:07:22 -07:00
Daya Khudia
ec7dc52e60 Fix a bug in qconv (#21294)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21294

Returned output tensor wasn't of correct shape

Reviewed By: zafartahirov

Differential Revision: D15605081

fbshipit-source-id: f79a9d5b93b8b97e79c09411b9dc681987a61e44
2019-06-05 10:19:31 -07:00
Jianyu Huang
0f58d20fe4 Add quantized::fbgemm_linear_unpack operator for serialization (#97)
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/97

Pull Request resolved: https://github.com/pytorch/pytorch/pull/20721

- FBGEMM: Add an unpack function for the PackBMatrix class: unpacks the pmat buffer to the origin_buf (used during serialization to recover the weight matrix).
- PyTorch Quantizer: Add the quantized::fbgemm_linear_unpack operator for serialization (round-trip sketch below).
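
A hedged round-trip sketch of the serialization idea: the unpack op named above recovers the original weight from its packed form, so only the unpacked weight needs to be serialized. The pack-side op name is a hypothetical stand-in; `fbgemm_linear_unpack` is the op this commit adds.
```python
import torch

qw = torch.quantize_linear(torch.randn(16, 8), 0.02, 0, torch.qint8)
packed = torch.ops.quantized.fbgemm_linear_prepack(qw)      # hypothetical pack op
qw_back = torch.ops.quantized.fbgemm_linear_unpack(packed)  # op added by this commit
# qw_back should match qw, recovering the weight matrix for serialization.
```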

Reviewed By: zafartahirov

Differential Revision: D15314568

fbshipit-source-id: 12080c8887ce31dc849d23e132ae1766ac319407
2019-06-03 20:36:30 -07:00
Zafar Takhirov
360e6d1b0b Fixes a bug in the test (#21146)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21146

The error was reported by https://our.intern.facebook.com/intern/test/562949965807317?ref_report_id=1837062

The API changed from `a.quantize_linear(...)` to `torch.quantize_linear(a, ...)`
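
In concrete terms (the scale/zero_point values below are arbitrary):
```python
import torch

a = torch.randn(4, 4)
# q = a.quantize_linear(0.1, 0, torch.quint8)        # old Tensor-method API (removed)
q = torch.quantize_linear(a, 0.1, 0, torch.quint8)   # new torch-level API
```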

Reviewed By: dskhudia

Differential Revision: D15557418

fbshipit-source-id: 88463e09fdf1f574f1b8128f6a00c2810091cd03
2019-06-01 18:00:33 -07:00
Jerry Zhang
7f960a9c01 remove quantize_linear from Tensor method (#21196)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21196

we'll add `quantize(quantizer)` as a tensor method later when we expose `quantizer` in Python frontend
Python
```
torch.quantize_linear(t, ...)
```
C++
```
at::quantize_linear(t, ...)
```

Differential Revision: D15577123

fbshipit-source-id: d0abeea488418fa9ab212f84b0b97ee237124240
2019-05-31 12:01:10 -07:00
Edward Yang
e161360b62 Revert D15558784: [reland][pt1][quant] remove quantize_linear from Tensor method
Differential Revision:
D15558784

Original commit changeset: 0b194750c423

fbshipit-source-id: d180a7f76bb05ad7470f17bc3d2bd614fab16529
2019-05-31 06:20:05 -07:00
Jerry Zhang
f91f24764e remove quantize_linear from Tensor method (#21156)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21156

we'll add `quantize(quantizer)` as a tensor method later when we expose `quantizer` in Python frontend
Python
```
torch.quantize_linear(t, ...)
```
C++
```
at::quantize_linear(t, ...)
```

Differential Revision: D15558784

fbshipit-source-id: 0b194750c423f51ad1ad5e9387a12b4d58d969a9
2019-05-30 22:02:12 -07:00
Daya S Khudia
726caeace3 Use QTensor for bias (#21038)
Summary:
Use QTensor for the bias tensor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21038

Differential Revision: D15524980

Pulled By: dskhudia

fbshipit-source-id: c7bf2efc8fe3f4b5574c721c2f64ff073045ecc4
2019-05-30 16:16:03 -07:00
Edward Yang
c4a90ca18e Revert D15477933: [pt1][quant] remove quantize_linear and dequantize from Tensor method
Differential Revision:
D15477933

Original commit changeset: c8aa81f681e0

fbshipit-source-id: ec494fbbab72e20da262bdd8657887e1fdd173cb
2019-05-30 05:04:12 -07:00
Jerry Zhang
67291ba74f remove quantize_linear and dequantize from Tensor method (#20874)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20874

A criterion for what should go in a Tensor method is whether NumPy has it; for this one it does not,
so we are removing it as a Tensor method. We can still call it as a function.
Python
```
torch.quantize_linear(t, ...), torch.dequantize(t)
```
C++
```
at::quantize_linear(t, ...), at::dequantize(t)
```

Reviewed By: dzhulgakov

Differential Revision: D15477933

fbshipit-source-id: c8aa81f681e02f038d72e44f0c700632f1af8437
2019-05-29 19:17:16 -07:00
Zafar Takhirov
9daf48525e Quantized Max Pool op (#20474)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20474

Parallel implementation of MaxPool (no ReLU).

Reviewed By: dskhudia

Differential Revision: D15327923

fbshipit-source-id: ca6475e7fe1434b55d4b7730a074bb7ff50355fd
2019-05-29 15:01:01 -07:00
Zafar Takhirov
2791a44948 Renaming the relu kernel and adding hypothesis tests (#20647)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20647

The initial assumption was that `qint8` would be unsigned. After the introduction of `quint8` and `qint8`, some tests broke.

Reviewed By: jerryzh168

Differential Revision: D15332106

fbshipit-source-id: 6ed18da428915aea918a363c5f38754a3c75d06b
2019-05-28 16:46:44 -07:00
Will Feng
8cde4c4d22 Remove Variable::Impl and DifferentiableViewImpl (#17072)
Summary:
As part of the Variable/Tensor merge work: https://github.com/pytorch/pytorch/issues/13638, we make the following changes in this PR:
1. Remove the `Variable::Impl` class and the `DifferentiableViewImpl` class
2. Change all `Variable.data()` call sites to either use `Variable` directly, or use `Variable.tensor_data()`
3. Remove `Variable.data()` API
4. Add `Variable.variable_data()` that matches `tensor.data` in Python API, which creates a new `Variable` that shares the same storage and tensor metadata with the original `Variable`, but with a completely new autograd history.

After this PR, Variable doesn't wrap a Tensor internally anymore, and both Variable and Tensor use the same TensorImpl class as their `impl_`. The only difference is that Variable always has AutogradMeta in its TensorImpl, but Tensor doesn't.

**Note that this PR is BC-breaking in the following use cases:**

**Use Case 1:**
Previously, `x.data = y` worked even if `x` and `y` were of different TensorImpl types (e.g. `x` is a CPU dense tensor whose impl is of type TensorImpl, while `y` is a CPU sparse tensor whose impl is of type SparseTensorImpl). After this PR, `x.data = y` no longer works if `x` and `y` are of different TensorImpl types, because the underlying implementation `variable.set_data(tensor)` no longer works if `variable` and `tensor` have different TensorImpl types.
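
A hedged illustration of Use Case 1 (shapes and values arbitrary):
```python
import torch

x = torch.randn(2, 2)  # dense CPU tensor (TensorImpl)
y = torch.sparse_coo_tensor(torch.tensor([[0], [1]]), torch.tensor([1.]), (2, 2))  # SparseTensorImpl

x.data = y  # worked before this PR; now fails because the TensorImpl types differ
```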

**Use Case 2:**
If a tensor `x`'s `grad` is sparse, accumulating dense gradients to `x` will change the tensor that `x.grad` is pointing to. This is better illustrated with the following example:
```python
params = torch.tensor([1.5, 1.5]).requires_grad_()
with torch.no_grad():
    # Change gradient to a sparse tensor
    params.grad = torch.sparse_coo_tensor(torch.tensor([[1, 1]]).long(), torch.tensor([1., 1.]))

grad_saved = params.grad
params.backward(torch.tensor([1.5, 1.5]))
assert id(grad_saved) == id(params.grad)  # This will fail after this PR
```
The assertion in the last line will fail after this PR, because adding dense gradients to sparse gradients will change the `params.grad` tensor reference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17072

Differential Revision: D14075257

Pulled By: yf225

fbshipit-source-id: 0e681df641270dea586042dd26db59f2e76b5957
2019-05-23 21:09:04 -07:00
Daya S Khudia
cde611a66c Quantized Conv2d operator (#20772)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20772

Copy of D15178352

A conflicting commit that removed registering kernels using IntArrayRef landed at the same time as D15178352; hence, D15178352 was reverted. Using std::vector instead.

Reviewed By: zafartahirov

Differential Revision: D15437237

fbshipit-source-id: cd2f1caebcc720352b48ce25d716cb1ca49a5197
2019-05-22 17:53:24 -07:00
Jianyu Huang
4a85e7955c Rename FC to Linear in the test routine (#20716)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20716

As Title says.

Reviewed By: zafartahirov

Differential Revision: D15410823

fbshipit-source-id: e82fc241ee288b41304675cb087c0cdcd60d7148
2019-05-21 19:58:19 -07:00
Jianyu Huang
e6f22e1b89 Change Bias to QTensor with qint32(int32_t) (#20713)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20713

As Title says.

Reviewed By: zafartahirov

Differential Revision: D15410734

fbshipit-source-id: c00f409278736cf9e3205f7d36dda1b96120f47d
2019-05-21 14:17:37 -07:00
Jianyu Huang
b9a150ede0 Change Weight to QTensor with qint8(int8_t) (#20712)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20712

As Title says.

Differential Revision: D15410696

fbshipit-source-id: 48147a79d8cc47a724eb473796a37a1c64f8e883
2019-05-21 14:17:34 -07:00
Jesse Hellemn
fac307a5cf Revert D15178352: [pt1][quant] Quantized Conv2d operator
Differential Revision:
D15178352

Original commit changeset: 2e5453283137

fbshipit-source-id: 73cf64c483eedbd41a047e7593c0c92bbd33008c
2019-05-21 09:59:57 -07:00
Daya S Khudia
29b1b59449 Quantized Conv2d operator (#20064)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20064

Initial implementation of quantized convolution operator using fbgemm.

Reviewed By: zafartahirov

Differential Revision: D15178352

fbshipit-source-id: 2e5453283137dc165e9a20164ffc138fa8caf88a
2019-05-21 09:13:42 -07:00