47 Commits

Author SHA1 Message Date
James Reed
812b1ad869 [quantization] FP16 dynamic quantized Linear
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32331

Test Plan: Imported from OSS

Differential Revision: D19441158

Pulled By: jamesr66a

fbshipit-source-id: c04247ffe707be68718c486c31bc6c6040f7dc11
2020-01-27 15:45:32 -08:00
Jianyu Huang
0bebfe2143 Add the explicit per-tensor/per-channel quant info when we print the module (#30591)
Summary:
As Title says. We would like to explicitly distinguish per-tensor/per-channel scheme when we print the module.

Here is an example for Lenet after applying the per-channel dynamic quantization:

Before this PR:
```
FloatModel(
  (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(20, 50, kernel_size=(5, 5), stride=(1, 1))
  (fc1): DynamicQuantizedLinear(
    in_features=800, out_features=500
    (_packed_params): LinearPackedParams()
  )
  (fc2): DynamicQuantizedLinear(
    in_features=500, out_features=10
    (_packed_params): LinearPackedParams()
  )
)
```

After this PR:
```
FloatModel(
  (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(20, 50, kernel_size=(5, 5), stride=(1, 1))
  (fc1): DynamicQuantizedLinear(
    in_features=800, out_features=500, qscheme=torch.per_channel_affine
    (_packed_params): LinearPackedParams()
  )
  (fc2): DynamicQuantizedLinear(
    in_features=500, out_features=10, qscheme=torch.per_channel_affine
    (_packed_params): LinearPackedParams()
  )
)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30591
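
To make the distinction reported by the new `qscheme` field concrete, here is a minimal pure-Python sketch of how the two schemes differ in the number of scales they compute. The helper names (`symmetric_scale`, `per_tensor_scale`, `per_channel_scales`) are hypothetical, and symmetric int8 quantization is assumed for brevity; it does not reproduce the actual kernels.

```python
# Illustrative only: symmetric int8 scales for a 2-D weight (rows = output channels).
def symmetric_scale(values, qmax=127):
    """Scale that maps the largest magnitude in `values` onto qmax."""
    largest = max(abs(v) for v in values)
    return largest / qmax if largest > 0 else 1.0


def per_tensor_scale(weight):
    # One scale shared by every element of the weight.
    return symmetric_scale([v for row in weight for v in row])


def per_channel_scales(weight):
    # One scale per output channel (row), so small rows keep more resolution.
    return [symmetric_scale(row) for row in weight]


weight = [[0.5, -1.27], [0.0254, 0.0127]]
print(per_tensor_scale(weight))
print(per_channel_scales(weight))
```

In this toy weight the second row is much smaller than the first, so its per-channel scale is far finer than the single per-tensor scale.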

Differential Revision: D18764366

Pulled By: jianyuh

fbshipit-source-id: e897ab42ace6b82b2a90729ba788313c7873de1a
2019-12-02 20:14:46 -08:00
James Reed
05a1644ce3 Fix BC for quantized linear
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30481

Test Plan: Imported from OSS

Differential Revision: D18714602

Pulled By: jamesr66a

fbshipit-source-id: d51206c22cf2446e98053446789c6324c0481321
2019-11-26 17:38:09 -08:00
James Reed
97fae401f0 Use LinearPackedParams everywhere
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30198

Test Plan: Imported from OSS

Differential Revision: D18628003

Pulled By: jamesr66a

fbshipit-source-id: 76ff0248fd859e805a15cde555d26dd2138636fa
2019-11-22 11:31:17 -08:00
Zafar Takhirov
675a4cb9fb Extracted quantize/dequantize out of linear.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29173

Test Plan: Imported from OSS

Differential Revision: D18318561

Pulled By: z-a-f

fbshipit-source-id: 89317bb5f56e31221ed9ed02bf727ce39f44ebf8
2019-11-08 14:35:15 -08:00
Zafar Takhirov
a5ac7f6387 Changing observer name
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27779

Test Plan: Imported from OSS

Differential Revision: D17886605

Pulled By: z-a-f

fbshipit-source-id: 68c50b482e65015336ff27171fd730da493525b6
2019-10-17 11:36:03 -07:00
Michael Suo
341262754f module dedupe (#26666)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26666

Changes:
- Introduce a `ConcreteModuleType` concept. This acts both as the key into the type
  cache, and as the source of truth for `ModuleValue::attr` queries. It needs
  to do both jobs because that's how we ensure correctness (if the types are
  different, it's because `ModuleValue::attr` would return different things).
- Now `recursive_script` will first construct a `ConcreteModuleType` and search for a
  pre-existing type before starting compilation.
- All previous paths to creating a `ScriptModule` (including inheriting from
  `ScriptModule`) are now rewritten to go through `create_script_module`, so
  that we have only a single place where construction happens.

Behavioral changes:
- Big change to `torch.jit.ScriptModule` inheritance: all attributes are now
  recursively scripted if possible, matching recursive scripting semantics.
  This makes it hard to keep something from being scripted (for example, a
  Python submodule). Possibly we'll need an `ignore()` type thing for
  attributes. In particular, this adds `self.training` to *every* ScriptModule, since
  it's present on every `nn.Module`.
- I believe this change to be transparent to existing users of the inheritance API, since if you had an attribute that is unscriptable that you never used, there is no error. In some cases, we will create new attributes (even if they are unused), which will increase serialized model size from before.
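
The type-cache idea described above can be sketched in plain Python. Everything here (`ConcreteTypeKey`, `make_key`, `get_or_compile`, the `Toy` class) is a hypothetical illustration, not the actual `ConcreteModuleType` implementation: the point is only that one hashable description serves both as the cache key and as the record of what attribute queries would return.

```python
# Hypothetical sketch: a type cache keyed by a hashable description of a module.
from dataclasses import dataclass


@dataclass(frozen=True)
class ConcreteTypeKey:
    py_class: type
    attributes: tuple  # sorted (attribute name, attribute type name) pairs


def make_key(module):
    attrs = tuple(sorted((k, type(v).__name__) for k, v in vars(module).items()))
    return ConcreteTypeKey(type(module), attrs)


_type_cache = {}
compile_count = 0


def get_or_compile(module):
    """Return a cached 'compiled type', compiling only on a cache miss."""
    global compile_count
    key = make_key(module)
    if key not in _type_cache:
        compile_count += 1
        _type_cache[key] = f"CompiledType#{compile_count}"
    return _type_cache[key]


class Toy:
    def __init__(self, value):
        self.value = value
        self.training = True


a, b, c = Toy(1), Toy(2), Toy("text")
print(get_or_compile(a), get_or_compile(b), get_or_compile(c))
```

Here `a` and `b` share a compiled type because their attribute types match, while `c` gets a new one because `value` is a string.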

Test Plan: Imported from OSS

Differential Revision: D17551196

Pulled By: suo

fbshipit-source-id: b476d1c9feb3ddfd63406d90989aaf9dfe890591
2019-10-12 09:51:57 -07:00
davidriazati
0046092178 Reduce special casing around 'training' (#27109)
Summary:
Most of this was old cruft left over from special handling of `training` before we had a `bool` type. This makes all modules have a `training` attribute that is true by default and removes all other special handling.

Fixes #26884
](https://our.intern.facebook.com/intern/diff/17728129/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27109

Pulled By: driazati

Differential Revision: D17728129

fbshipit-source-id: 8ddc9fbb07a953dd05529538bfdd01ed88b5cb57
2019-10-07 13:52:59 -07:00
Zafar Takhirov
27dc595215 Rename _intrinsic to intrinsic
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27194

Test Plan: Imported from OSS

Differential Revision: D17704957

Pulled By: zafartahirov

fbshipit-source-id: 46f02d129aa77c3047b2a6c606bfadd831a6b0fc
2019-10-02 18:53:06 -07:00
Raghuraman Krishnamoorthi
dddae3f854 Fuse module enhancements (#26457)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26457

Enhance fuse module to support sequentials; the fuse list can now name modules the same way the state dict does.
Also add support for Conv-Relu and linear-relu fusion
Also support inplace and out of place fusion of models.
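
A rough sketch of name-based fusion with an in-place/out-of-place switch, using a plain dict of callables instead of real modules. The `fuse` helper and its `inplace` flag are hypothetical and only illustrate the shape of the feature described above.

```python
# Hypothetical sketch of name-based fusion on a dict of callables.
def relu(x):
    return max(0.0, x)


def identity(x):
    return x


def fuse(modules, names_to_fuse, inplace=False):
    """Replace the first named entry with the composition of all named
    entries, and the remaining named entries with identity."""
    target = modules if inplace else dict(modules)
    stages = [modules[name] for name in names_to_fuse]

    def fused(x):
        for stage in stages:
            x = stage(x)
        return x

    target[names_to_fuse[0]] = fused
    for name in names_to_fuse[1:]:
        target[name] = identity
    return target


model = {"linear": lambda x: 2.0 * x - 3.0, "relu": relu}
fused_model = fuse(model, ["linear", "relu"])
print(fused_model["linear"](1.0), fused_model["linear"](4.0))
```

With `inplace=False` the original `model` dict is left untouched and a fused copy is returned.
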
ghstack-source-id: 91076386

Test Plan:
buck test caffe2/test:quantization -- 'test_fusion_sequential_model_train \(test_quantization\.FusionTest\)' --print-passing-details
buck test caffe2/test:quantization -- 'test_fusion_sequential_model_eval \(test_quantization\.FusionTest\)' --print-passing-details

Differential Revision: D17466382

fbshipit-source-id: 0a548f8f4c366f3ecc59db693bac725ccd62328e
2019-09-30 22:00:20 -07:00
James Reed
4d7bec5f3e Improve repr for quantized modules
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27008

Test Plan: Imported from OSS

Differential Revision: D17649174

Pulled By: jamesr66a

fbshipit-source-id: e3e6c4bb31e1ad8ed1ebe27f803f90d564ecfe53
2019-09-28 15:15:14 -07:00
Raghuraman Krishnamoorthi
2ccbdb79c8 Per-channel baseline (#26516)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26516

ghstack-source-id: 90982010

Test Plan:
Integrate per-channel support into conv and linear modules.
The following tests pass:
buck test caffe2/test:quantized -- 'test_linear_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details

buck test caffe2/test:quantized -- 'test_conv_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details

buck test caffe2/test:quantized -- 'test_float_quant_compare_per_channel \(test_quantized_models\.ModelNumerics\)' --print-passing-details

Differential Revision: D17342622

fbshipit-source-id: f0d618928e3d9348672c589a6b7a47049c372a2e
2019-09-28 14:05:06 -07:00
James Reed
df16fb9ca1 Throw if someone tries to torch.save() quantized modules (#26828)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26828

Pickle serialization for quantized modules is currently broken by https://github.com/pytorch/pytorch/issues/24045, so let's be loud and fail if the user tries to do it
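
One way to fail loudly on pickling is to raise from `__getstate__`. The sketch below uses a hypothetical stand-in class and is not necessarily how this PR implements the check.

```python
import pickle


class QuantizedModuleSketch:
    """Hypothetical stand-in for a module whose pickling is known to be broken."""

    def __init__(self, scale, zero_point):
        self.scale = scale
        self.zero_point = zero_point

    def __getstate__(self):
        # Refuse to serialize rather than silently writing a bad file.
        raise RuntimeError("pickle serialization is not supported for this module")


try:
    pickle.dumps(QuantizedModuleSketch(0.1, 128))
except RuntimeError as err:
    print("refused:", err)
```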

Test Plan: Imported from OSS

Differential Revision: D17579127

Pulled By: jamesr66a

fbshipit-source-id: 3deccac7e4590c6f648f22bb79c57badf3bf0487
2019-09-25 19:55:17 -07:00
Jerry Zhang
254122dd4e quantize_linear -> quantize_per_tensor (#26574)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26574

Since we also have `quantized::linear`, `quantize_linear` sounds
confusing, so we plan to rename it before the branch cut

Test Plan:
ci

Imported from OSS

Differential Revision: D17514876

fbshipit-source-id: 01d9005e6ec8cb9950b9d8bba122109c389641d3
2019-09-20 21:58:48 -07:00
Daya Khudia
2b52c1d982 Dynamic quantization for bias. (#26057)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26057

bias is now unquantized (i.e. floating type) for qconv and qlinear. It is dynamically quantized by fbgemm.

TODO: Add some performance numbers.

Tests:

test:quantization
```
Summary (total time 8.41s):
  PASS: 24
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
More details at https://our.intern.facebook.com/intern/buck/build/74d5f6f7-55c9-4350-a618-2013042fffd8
```

test:quantized
```
Summary (total time 13.21s):
  PASS: 43
  FAIL: 0
  SKIP: 5
    caffe2/test:quantized - test_qnnpack_maxpool2d (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_compare_tensor_scalar (test_quantized.TestComparatorOps)
    caffe2/test:quantized - test_qnnpack_linear (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qnnpack_relu (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qnnpack_add (test_quantized.TestQNNPackOps)
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```
ghstack-source-id: 90166254

Test Plan:
buck test mode/dev caffe2/test:quantization

buck test mode/dev caffe2/test:quantized

Differential Revision: D17328028

fbshipit-source-id: d4a163d730d0f4a03e8e0faf7420710cf36eec09
2019-09-16 14:43:06 -07:00
Supriya Rao
9d2d31e626 Store bias in PackedLinearWeight struct in fbgemm (#25428)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25428

Added bias as an optional param to the quantized_linear_prepack function.
Bias is quantized during runtime using input scale and weight scale.
ghstack-source-id: 89601399
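
A small sketch of quantizing the bias at run time, assuming the common convention that the bias scale is the product of the input scale and the weight scale. The function name `quantize_bias` is hypothetical and the real work happens inside the backend.

```python
# Sketch: the bias is stored as float and quantized once the input scale is known.
def quantize_bias(bias, input_scale, weight_scale):
    bias_scale = input_scale * weight_scale  # assumed convention
    return [round(b / bias_scale) for b in bias], bias_scale


q_bias, bias_scale = quantize_bias([1.0, -0.5], input_scale=0.5, weight_scale=0.25)
print(q_bias, bias_scale)
```

Because the input scale can differ between calls, the float bias has to be kept around and re-quantized rather than quantized once at pack time.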

Test Plan: python test/run_test.py --exclude nn --verbose --bring-to-front quantization quantized quantized_tensor quantized_nn_mods quantizer

Differential Revision: D17121304

fbshipit-source-id: 8adb0e55e4aed0a5430aaa2c8639c8ad1639c85a
2019-09-06 08:37:34 -07:00
Supriya Rao
61819260f7 Rename FBGEMM quantized operators to generic quantized ops (#25678)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25678

As an effort to unify fbgemm and qnnpack at the dispatcher level, we need to have a generic name for the quantized backend ops.
Currently FBGEMM is guarded by the USE_FBGEMM macro and QNNPACK uses USE_QNNPACK.
ghstack-source-id: 89518961

Test Plan: buck test caffe2/test:quantized

Differential Revision: D17194364

fbshipit-source-id: 5960aedff6b8cb89eb3872c39b74caf54c0fbf20
2019-09-05 10:13:08 -07:00
Edward Yang
55da02a86d Revert D17097735: [quantization] Rename fbgemm quantized operators to generic quantized ops
Test Plan: revert-hammer

Differential Revision:
D17097735

Original commit changeset: 447112a7a421

fbshipit-source-id: 78368b6f84d96cea70692fb000cebe99602a08c1
2019-09-04 15:02:32 -07:00
Supriya Rao
c9ba5186d3 Rename fbgemm quantized operators to generic quantized ops (#25338)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25338

As an effort to unify fbgemm and qnnpack at the dispatcher level, we need to have a generic name for the quantized backend ops.
Currently FBGEMM is guarded by the USE_FBGEMM macro and QNNPACK uses USE_QNNPACK.

TBD: Use compile time macro or run_time to switch between fbgemm and qnnpack.
ghstack-source-id: 89454244

Test Plan: buck test caffe2/test:quantized

Differential Revision: D17097735

fbshipit-source-id: 447112a7a421387724d3e29b8fd8412dfb1c373a
2019-09-04 14:27:27 -07:00
Raghuraman Krishnamoorthi
9945c0cea6 Work around for bias quantization for conv and linear operators (#25212)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25212

In eager mode, all modules need to work with input tensors that can change qparams dynamically. This issue https://github.com/pytorch/pytorch/issues/23874 will address this via FBGEMM modifications. This is a workaround until then.
ghstack-source-id: 89118038

Test Plan:
buck test caffe2/test:quantized -- 'test_conv_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details
Summary (total time 65.86s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0

Differential Revision: D17064471

fbshipit-source-id: 3c192442b19bf2d9d88d4e52de6c24dc134a846f
2019-08-28 07:24:03 -07:00
Raghuraman Krishnamoorthi
26a438d4fb Revert D16852280: Work around for bias quantization for conv and linear operators
Test Plan: revert-hammer

Differential Revision:
D16852280

Original commit changeset: 988f8ff91616

fbshipit-source-id: e2cf03e13dc8dcf0db22d43740d72fd8b069fd74
2019-08-26 16:25:33 -07:00
Raghuraman Krishnamoorthi
ea601d90d6 Work around for bias quantization for conv and linear operators (#24789)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24789

In eager mode, all modules need to work with input tensors that can change qparams dynamically. This issue https://github.com/pytorch/pytorch/issues/23874 will address this via FBGEMM modifications. This is a workaround until then.
ghstack-source-id: 89003798

Test Plan:
buck test caffe2/test:quantized -- 'test_conv_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details
Summary (total time 65.86s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0

Differential Revision: D16852280

fbshipit-source-id: 988f8ff91616eddf511e71926aa7d2d0f1938188
2019-08-26 12:16:42 -07:00
James Reed
a0b13b4fa5 extra_repr for quantized modules (#24443)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24443

This gives us useful information about the Module when we print it, like so:

```
FloatModule(
  (quant): Quantize()
  (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1), scale=0.08209919929504395, zero_point=128)
  (conv2): Conv2d(20, 50, kernel_size=(5, 5), stride=(1, 1), scale=0.16885940730571747, zero_point=128)
  (fc1): Linear(in_features=800, out_features=500, bias=True, scale=0.12840059399604797, zero_point=128)
  (fc2): Linear(in_features=500, out_features=10, bias=True, scale=0.260015606880188, zero_point=128)
  (dequant): DeQuantize()
)
```
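
The `extra_repr` pattern can be sketched without torch as follows. `ModuleSketch` and `QuantizedLinearSketch` are hypothetical stand-ins that mimic only the repr protocol, not `torch.nn.Module`.

```python
class ModuleSketch:
    """Hypothetical minimal stand-in for the repr protocol."""

    def extra_repr(self):
        return ""

    def __repr__(self):
        return f"{type(self).__name__}({self.extra_repr()})"


class QuantizedLinearSketch(ModuleSketch):
    def __init__(self, in_features, out_features, scale, zero_point):
        self.in_features = in_features
        self.out_features = out_features
        self.scale = scale
        self.zero_point = zero_point

    def extra_repr(self):
        # Subclasses only describe their own fields; the base class formats them.
        return (f"in_features={self.in_features}, out_features={self.out_features}, "
                f"scale={self.scale}, zero_point={self.zero_point}")


print(QuantizedLinearSketch(800, 500, 0.125, 128))
```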

Test Plan: Imported from OSS

Differential Revision: D16847140

Pulled By: jamesr66a

fbshipit-source-id: 8c995108f17ed1b086d1fb30471a41c532c68080
2019-08-16 22:38:45 -07:00
James Reed
a1b111709d Assert weight_observer has the correct dtype
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24436

Test Plan: Imported from OSS

Differential Revision: D16847141

Pulled By: jamesr66a

fbshipit-source-id: 1dde5c26449115b53e71d410b41204d743787c44
2019-08-15 19:40:14 -07:00
Jerry Zhang
754bf383b1 Change return type of observer to two tensors (#24339)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24339

Att

Differential Revision: D16820813

fbshipit-source-id: 3e7301f1700176e19f46e8677a644ba167209254
2019-08-15 10:26:44 -07:00
Jerry Zhang
761ae8e9b6 Add intrinsic module mappings (#23753)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23753

Add intrinsic (fused) module mappings in quantize.py to enable mapping fused modules
in both QAT and PTQ

Differential Revision: D16820749

fbshipit-source-id: 07de76a4f09b44bde8b193c103eac02c22b875b6
2019-08-15 09:37:24 -07:00
James Reed
7923884a03 Fix incorrect type annotation on Linear __setstate__
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24209

Test Plan: Imported from OSS

Differential Revision: D16777886

Pulled By: jamesr66a

fbshipit-source-id: 4f75b3c16458f093a5ae658d36dcb7a6d313410a
2019-08-12 19:21:41 -07:00
James Reed
f66bfa7ec4 state_dict serialization for Conv2d + some bugfixes
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24116

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D16765476

Pulled By: jamesr66a

fbshipit-source-id: 96275cea87d7f5e7de5d1925cbce220066f1a465
2019-08-12 16:24:54 -07:00
James Reed
a45dafc66a JIT Serialization of nnq.Linear (#24048)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24048

Add `__{g,s}etstate__` methods on `nnq.Linear` for JIT (and `torch.{save,load}`) serialization.

Unfortunately, this unearthed a bug in serialization documented in https://github.com/pytorch/pytorch/issues/24045. The check that triggered the bug has been disabled pending a fix
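
As a rough illustration of the `__getstate__`/`__setstate__` pair, the sketch below serializes plain data instead of an opaque packed weight and repacks on load. The class and its `_pack`/`_unpack` helpers are hypothetical; the tuple-of-tuples "packing" merely stands in for a backend-specific layout.

```python
class PackedLinearSketch:
    """Hypothetical module holding an opaque 'packed' weight."""

    def __init__(self, weight, bias):
        self.bias = bias
        self._packed = self._pack(weight)

    @staticmethod
    def _pack(weight):
        return tuple(tuple(row) for row in weight)  # stand-in for a backend layout

    @staticmethod
    def _unpack(packed):
        return [list(row) for row in packed]

    def __getstate__(self):
        # Serialize plain data rather than the packed representation.
        return {"weight": self._unpack(self._packed), "bias": self.bias}

    def __setstate__(self, state):
        self.bias = state["bias"]
        self._packed = self._pack(state["weight"])


original = PackedLinearSketch([[1, 2], [3, 4]], [0.5, -0.5])
state = original.__getstate__()

# Rebuild without calling __init__, the same way an unpickler would.
restored = PackedLinearSketch.__new__(PackedLinearSketch)
restored.__setstate__(state)
print(restored._unpack(restored._packed), restored.bias)
```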

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D16728347

Pulled By: jamesr66a

fbshipit-source-id: c3b850be3b831f4c77cec3c2df626151b2af8b34
2019-08-09 17:14:58 -07:00
James Reed
ca2010cfea State dict serialization of nnq.Linear (#24047)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24047

Add `_{save_to,load_from}_state_dict` methods to `nnq.Linear` that explicitly deal with conversions from the Python attributes to the serialized state dict form
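
The explicit conversion between Python attributes and state dict entries can be sketched as below. The class and method names are hypothetical simplifications (no leading underscores, no extra arguments) and do not match the real `nn.Module` hook signatures.

```python
class LinearStateSketch:
    """Hypothetical module keeping plain Python attributes and converting them explicitly."""

    def __init__(self, weight, bias, scale, zero_point):
        self.weight, self.bias = weight, bias
        self.scale, self.zero_point = scale, zero_point

    def save_to_state_dict(self, destination, prefix):
        # Every entry is namespaced by the module's prefix, e.g. "fc1.".
        destination[prefix + "weight"] = self.weight
        destination[prefix + "bias"] = self.bias
        destination[prefix + "scale"] = float(self.scale)
        destination[prefix + "zero_point"] = int(self.zero_point)

    def load_from_state_dict(self, state_dict, prefix):
        self.weight = state_dict[prefix + "weight"]
        self.bias = state_dict[prefix + "bias"]
        self.scale = float(state_dict[prefix + "scale"])
        self.zero_point = int(state_dict[prefix + "zero_point"])


state = {}
LinearStateSketch([[1, 2]], [0.0], 0.25, 128).save_to_state_dict(state, "fc1.")
print(sorted(state))

loaded = LinearStateSketch(None, None, 1.0, 0)
loaded.load_from_state_dict(state, "fc1.")
print(loaded.scale, loaded.zero_point)
```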

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D16728346

Pulled By: jamesr66a

fbshipit-source-id: 182c9f5069d509147dc9020b341b6cb87505fe7f
2019-08-09 17:14:52 -07:00
James Reed
442b3512d4 Simplified nnq.Linear class (#24046)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24046

`nnq.Linear` was a confusing mess of buffers/attributes and Tensor/not tensor members. This PR reworks it to consistently have only Python attributes, with the conversions handled explicitly by state_dict or `__{get,set}state__` methods (added in PRs further up the stack).

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D16728345

Pulled By: jamesr66a

fbshipit-source-id: 47468b776b428fca2409bb55c8b161afb68a3379
2019-08-09 17:14:48 -07:00
James Reed
a35d2902ef jit.script() testing and fixes (#23891)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23891

This adds an initial set of testing coverage for quantization that checks if the modules can be scripted. Testing for tracing and serialization is forthcoming

Test Plan: Imported from OSS

Differential Revision: D16698045

Pulled By: jamesr66a

fbshipit-source-id: 96d80d938b816220af72359165a7b96d998a30c9
2019-08-08 12:06:18 -07:00
Zafar Takhirov
5e4c24baef Documentation cleanup
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23148

Test Plan: Imported from OSS

Differential Revision: D16414202

Pulled By: zafartahirov

fbshipit-source-id: a999be0384a2ff5272dd2f8adcf87547ce6ee9dd
2019-07-31 11:30:44 -07:00
Jerry Zhang
77353636de Conv module (#23084)
Summary:
Added Conv module for qat

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23084
ghstack-source-id: 86862445

Differential Revision: D16379417

fbshipit-source-id: 742cc8b8e0f132070ca4943a1c2e3db60c2b5bdc
2019-07-19 18:49:52 -07:00
Jerry Zhang
7cc029cb75 Quantization aware training in eager mode (#23082)
Summary:
Add support for quantization aware training in eager mode

Modifications to Post training flow:
## Prepare
* Fusion: e.g. (Conv, Bn) → ConvBn (float)
* Swapping: To insert fake_quant on the weight, we need to swap the float modules that have weights with the corresponding qat modules, e.g. Conv → torch.nn.qat.Conv, ConvBn → torch.nn._intrinsic.qat.ConvBn
* Previously we were thinking about modifying the weight in a forward_pre hook and changing it back in a forward hook:
```
def forward_pre_hook(self, input):
    self.float_weight = self.weight
    self.weight = self.fake_quantize(self.float_weight)

def forward_hook(self, input):
    self.weight = self.float_weight
```

* Assignments to self.weight are needed because we can’t change forward function and in forward function they are using self.weight.
* But we will need to keep two copies of weight in this case, so it’s probably better to just swap the module
* So we want to just swap Conv to torch.nn.qat.Conv and Linear to torch.nn.qat.Linear
* qat modules will have fake_quant for output and weights inserted in forward function
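
For readers unfamiliar with fake quantization, here is a scalar sketch of what a fake_quant step does: quantize, clamp, and immediately dequantize, so the value stays a float but carries the rounding and clamping error. The function signature is a hypothetical simplification, with an unsigned 8-bit range assumed.

```python
def fake_quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Quantize then dequantize a single float value."""
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))       # clamp to the representable integer range
    return (q - zero_point) * scale   # back to float, now on the quantization grid


print(fake_quantize(0.3, scale=0.25, zero_point=128))
print(fake_quantize(100.0, scale=0.25, zero_point=128))
```

The first call snaps 0.3 to the nearest grid point, and the second shows saturation at the top of the range.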

## Convert
* flow should be identical to ptq, but the swapping dictionary is slightly different since modules are changed in prepare step.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23082
ghstack-source-id: 86824650

Differential Revision: D16379374

fbshipit-source-id: 7d16d1acd87025065a24942ff92abf18e9fc8070
2019-07-19 14:57:25 -07:00
Soumith Chintala
84c2c89e2c Revert D16199356: [qat] Quantization aware training in eager mode
Differential Revision:
D16199356

Original commit changeset: 62aeaf47c12c

fbshipit-source-id: d06a96b0a617ae38029ffb246173ec065454b666
2019-07-19 03:18:48 -07:00
Soumith Chintala
f19aa12ae5 Revert D16274792: [qat] Conv module
Differential Revision:
D16274792

Original commit changeset: 1da10194123b

fbshipit-source-id: 71b34774b463f2350289bd39b8cfd798e095ffa5
2019-07-19 03:18:45 -07:00
Jerry Zhang
12d9d768b8 Conv module (#22899)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22899

Added Conv module for qat

Reviewed By: zafartahirov

Differential Revision: D16274792

fbshipit-source-id: 1da10194123b2759a6a35c60d1c2d2c0b569ccdc
2019-07-18 18:58:07 -07:00
Jerry Zhang
65ef671d11 Quantization aware training in eager mode (#22732)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22732

Add support for quantization aware training in eager mode

Modifications to Post training flow:
## Prepare
* Fusion: e.g. (Conv, Bn) → ConvBn (float)
* Swapping: To insert fake_quant on the weight, we need to swap the float modules that have weights with the corresponding qat modules, e.g. Conv → torch.nn.qat.Conv, ConvBn → torch.nn._intrinsic.qat.ConvBn
* Previously we were thinking about modifying the weight in a forward_pre hook and changing it back in a forward hook:
```
def forward_pre_hook(self, input):
    self.float_weight = self.weight
    self.weight = self.fake_quantize(self.float_weight)

def forward_hook(self, input):
    self.weight = self.float_weight
```

* Assignments to self.weight are needed because we can’t change forward function and in forward function they are using self.weight.
* But we will need to keep two copies of weight in this case, so it’s probably better to just swap the module
* So we want to just swap Conv to torch.nn.qat.Conv and Linear to torch.nn.qat.Linear
* qat modules will have fake_quant for output and weights inserted in forward function

## Convert
* flow should be identical to ptq, but the swapping dictionary is slightly different since modules are changed in prepare step.
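
The swapping step driven by a mapping dictionary might look roughly like this. All classes and the `swap_modules` helper are hypothetical toys; the real flow operates on `nn.Module` trees and copies weights and observers across.

```python
# Hypothetical float -> qat swap table applied recursively to a toy module tree.
class Conv: ...
class Linear: ...
class QATConv: ...
class QATLinear: ...


class Container:
    def __init__(self, **children):
        self.children = dict(children)


QAT_MAPPING = {Conv: QATConv, Linear: QATLinear}


def swap_modules(module, mapping):
    """Return the replacement for `module`, recursing into containers."""
    if isinstance(module, Container):
        for name, child in module.children.items():
            module.children[name] = swap_modules(child, mapping)
        return module
    replacement = mapping.get(type(module))
    return replacement() if replacement is not None else module


model = Container(conv1=Conv(), block=Container(fc=Linear()))
swap_modules(model, QAT_MAPPING)
print(type(model.children["conv1"]).__name__)
```

A convert step could reuse the same traversal with a different dictionary, for example one that maps the qat classes to quantized ones.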

Reviewed By: zafartahirov

Differential Revision: D16199356

fbshipit-source-id: 62aeaf47c12c62a87d9cac208f25f7592e245d6c
2019-07-18 18:58:03 -07:00
Jianyu Huang
8ec712da30 Add support for handling Bias being nullptr for torch.ops.quantized.fbgemm_linear (#22403)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22403

- C10 Operator Registration (https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/op_registration/op_registration.cpp) supports None type.

- ATen has None Tensor support, e.g., https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/native_functions.yaml#L1078

Reviewed By: zafartahirov

Differential Revision: D16069522

fbshipit-source-id: 3acaec783fc138ff36b14ffc0582d0764be4ad34
2019-07-11 17:33:08 -07:00
Zafar Takhirov
d21e476dcd Quantized Conv2d Module (#21323)
Summary:
Stack:
      https://github.com/pytorch/pytorch/issues/21808 Quantized conv avoid functional usage  [💛](https://our.intern.facebook.com/intern/diff/D15835572/)
      **https://github.com/pytorch/pytorch/issues/21323 Quantized Conv2d Module**  [💛](https://our.intern.facebook.com/intern/diff/D15551835/)

Quantized Conv2d Module
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21323

Test Plan:
Tests are split into two parts: functional and API.

`buck test mode/dev caffe2/test:quantized -- test_conv_api` : https://our.intern.facebook.com/intern/testinfra/testrun/4785074605318491

```
Parsing buck files: finished in 1.4 sec
Building: finished in 4.6 sec (100%) 7136/7136 jobs, 2 updated
  Total time: 6.1 sec
Trace available for this run at /tmp/testpilot.20190703-153023.392592.log
TestPilot test runner for Facebook. See https://fburl.com/testpilot for details.
Testpilot build revision 7149de230b9e1cdc7a872bb31fe099f0616dee09 fbpkg e59e6ab0fe8e47a496f915d34555c3ad at Fri Jun 28 12:20:54 2019 by twsvcscm from /usr/local/fbprojects/packages/testinfra.testpilot/647/t.par
Discovering tests
Running 2 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/4785074605318491
      ✓ caffe2/test:quantized - test_conv_api (test_nn_quantized.ModuleAPITest) 0.044 1/2 (passed)
      ✓ caffe2/test:quantized - test_conv_api (test_quantized_conv.FunctionalAPITest) 5.109 2/2 (passed)
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/4785074605318491
Summary (total time 9.08s):
  PASS: 2
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```

Differential Revision: D15551835

Pulled By: zafartahirov

fbshipit-source-id: 481a7df4b8a88e485437e1596eefb08d5e6766fa
2019-07-10 21:31:24 -07:00
Jerry Zhang
5040d52a5a torch.quantization conversion utilities, observers for eager mode quantization (#22010)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22010

torch.quantization module with observers and conversion routines
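
As a sketch of what an observer does, the toy class below tracks the running min/max of the values it sees and derives a scale and zero point from them. `MinMaxObserverSketch` and its method names are hypothetical, and an unsigned 8-bit affine range is assumed.

```python
class MinMaxObserverSketch:
    """Hypothetical observer: records a running range, then derives qparams."""

    def __init__(self, qmin=0, qmax=255):
        self.qmin, self.qmax = qmin, qmax
        self.min_val = None
        self.max_val = None

    def observe(self, values):
        lo, hi = min(values), max(values)
        self.min_val = lo if self.min_val is None else min(self.min_val, lo)
        self.max_val = hi if self.max_val is None else max(self.max_val, hi)

    def calculate_qparams(self):
        # Include zero in the range so that 0.0 is exactly representable.
        lo, hi = min(self.min_val, 0.0), max(self.max_val, 0.0)
        scale = (hi - lo) / (self.qmax - self.qmin) or 1.0
        zero_point = self.qmin - round(lo / scale)
        return scale, max(self.qmin, min(self.qmax, zero_point))


obs = MinMaxObserverSketch()
obs.observe([-1.0, 0.5])
obs.observe([0.25, 3.0])
print(obs.calculate_qparams())
```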

Reviewed By: zafartahirov

Differential Revision: D15554183

fbshipit-source-id: 05a3fabe28dd701978b8ecebf5bfc3a4c044ba5c
2019-07-09 10:51:38 -07:00
David Riazati
10c4b98ade Remove weak script (#22212)
Summary:
* Deletes all weak script decorators / associated data structures / methods
   * In order to keep supporting the standard library in script, this enables recursive script on any function defined in `torch.nn`
   * Most changes in `torch/nn` are the result of `ag -Q "weak" torch/nn/ -l | xargs sed -i '/weak/d'`, only `rnn.py` needed manual editing to use the `ignore` and `export` to continue supporting the overloaded `forward` methods
* `Sequential`/`ModuleList` no longer need to be added to constants since they are compiled on demand

This should also fix https://github.com/pytorch/pytorch/issues/22212
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22212

Differential Revision: D15988346

Pulled By: driazati

fbshipit-source-id: af223e3ad0580be895377312949997a70e988e4f
2019-07-03 17:28:25 -07:00
Jerry Zhang
0804452709 fix lint in torch/nn/quantized/modules/linear.py (#22325)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22325

att

Reviewed By: bddppq

Differential Revision: D16042464

fbshipit-source-id: 0610896c08667fdaa95983f49140193ecb9ede16
2019-06-27 23:18:42 -07:00
Jerry Zhang
5e77111486 nn.quantized.Relu and nn.quantized.Quantize/DeQuantize modules
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21930

Differential Revision: D15554224

fbshipit-source-id: 1de9ac7412468106be60e53852c23318ead37bc6
2019-06-27 16:15:17 -07:00
Jerry Zhang
2832e33a94 Add serialization for nn.quantized.Linear module (#21925)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21925

att

Differential Revision: D15483071

fbshipit-source-id: 3a218dad5b653b38a0885339889ff70c75a13bef
2019-06-27 14:57:22 -07:00
Jerry Zhang
5c46e701fc Implementation of nn.quantized.linear module (#21921)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21921

Call FBGEMM kernels to implement quantized linear operator. This operator is used only for inference.
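
For orientation, here is a pure-Python sketch of the arithmetic a quantized linear operator performs: integer accumulation, rescaling to a real value, then requantization of the output. It assumes a symmetric weight (zero point 0), a float bias, and an unsigned 8-bit output; the function name and signature are hypothetical and bear no relation to the FBGEMM API.

```python
def quantized_linear(x_q, x_scale, x_zp, w_q, w_scale, bias, out_scale, out_zp):
    """Integer matmul with float requantization; w_q is assumed symmetric."""
    out = []
    for row, b in zip(w_q, bias):
        acc = sum((xq - x_zp) * wq for xq, wq in zip(x_q, row))  # integer accumulation
        real = acc * x_scale * w_scale + b                        # back to a real value
        q = round(real / out_scale) + out_zp                      # requantize the output
        out.append(max(0, min(255, q)))                           # clamp to uint8 range
    return out


print(quantized_linear([130, 126], 0.5, 128, [[2, 4], [-1, 1]], 0.25, [0.5, 0.0], 0.25, 128))
```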

Differential Revision: D15375695

fbshipit-source-id: b9ca6c156fd60481fea83e55603b2897f7bfc3eb
2019-06-27 14:09:48 -07:00