Commit Graph

350 Commits

Author SHA1 Message Date
James Reed
c941dd3492 [FX] s/get_param/get_attr/ (#45000)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45000

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D23798016

Pulled By: jamesr66a

fbshipit-source-id: 1d2f3db1994a62b95d0ced03bf958e54d30c35dd
2020-09-21 14:09:32 -07:00
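
A minimal sketch of what the renamed node kind looks like in a traced module (assuming the public torch.fx API; the toy module M is purely illustrative): parameters and buffers accessed in forward now appear as nodes with op == 'get_attr' instead of 'get_param'.

```
import torch
import torch.fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        # Accessing a parameter directly creates a 'get_attr' node in the graph.
        return self.linear(x) + self.linear.weight.sum()

gm = torch.fx.symbolic_trace(M())
for node in gm.graph.nodes:
    if node.op == 'get_attr':   # formerly 'get_param'
        print(node.name, '->', node.target)
```
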
Vasiliy Kuznetsov
2163d31016 histogram observer: ensure buffer shape consistency (#44956)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44956

Makes the buffers of HistogramObserver have the
same shapes in the uninitialized and initialized states.

This is useful because the detectron2 checkpointer assumes
that these states will stay the same, so it removes the
need for manual hacks around the shapes changing.

Test Plan:
```
python test/test_quantization.py TestObserver.test_histogram_observer_consistent_buffer_shape
```

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23785382

fbshipit-source-id: 1a83fd4f39b244b00747c368d5d305a07d877c92
2020-09-19 09:29:39 -07:00
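
A minimal sketch of the consistency this commit targets (assuming the torch.quantization.HistogramObserver API): buffer shapes reported by state_dict() should be the same before and after the observer has seen data.

```
import torch
from torch.quantization import HistogramObserver

obs = HistogramObserver()
shapes_before = {k: tuple(v.shape) for k, v in obs.state_dict().items()}
obs(torch.randn(16, 8))          # observe some data
shapes_after = {k: tuple(v.shape) for k, v in obs.state_dict().items()}
# With this change the two dicts are expected to match, so checkpointers that
# rely on stable buffer shapes (e.g. detectron2's) need no special handling.
print(shapes_before)
print(shapes_after)
```
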
Supriya Rao
1fde54d531 [quant][qat] Ensure fake_quant and observer can be disabled on scriptmodule (#44773)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44773

The model is created and prepared using fx APIs and then scripted for training.
In order to test QAT on the scripted model we need to be able to disable/enable fake_quant
and observer modules on it.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qat_and_script

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23741354

fbshipit-source-id: 3fee7aa9b049d9901313b977710f4dc1c4501532
2020-09-17 10:21:52 -07:00
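
For reference, a rough sketch of the eager-mode toggles this commit makes usable on a scripted model, assuming the standard torch.quantization helpers (the toy model is illustrative):

```
import torch
from torch.quantization import (
    get_default_qat_qconfig, prepare_qat,
    disable_fake_quant, enable_fake_quant,
    disable_observer, enable_observer,
)

model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).train()
model.qconfig = get_default_qat_qconfig('fbgemm')
prepared = prepare_qat(model)

# Typical QAT recipe: freeze observers / fake-quant at chosen points in training.
prepared.apply(disable_observer)
prepared.apply(disable_fake_quant)
prepared.apply(enable_fake_quant)
# This PR ensures the same enable/disable toggles work after torch.jit.script.
```
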
Supriya Rao
361b38da19 [quant][fx] Add node name as prefix to observer module name (#44765)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44765

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_save_observer_state_dict

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23741355

fbshipit-source-id: 7185ceae5b3b520ac0beebb627c44eab7ae7d231
2020-09-17 10:17:42 -07:00
Xiang Gao
20ac736200 Remove py2 compatible future imports (#44735)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735

Reviewed By: mruberry

Differential Revision: D23731306

Pulled By: ezyang

fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f
2020-09-16 12:55:57 -07:00
Supriya Rao
3f512b0de2 [quant][qat] Ensure observers and fq modules are scriptable (#44749)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44749

Ensure fx module is scriptable after calling prepare_qat on it

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qat_and_script

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23718380

fbshipit-source-id: abf63ffb21e707f7def8f6c88246877f5aded58c
2020-09-16 09:30:07 -07:00
Jerry Zhang
e594c30bc2 [quant][graphmode][fx] Support fp16 dynamic quantization for linear (#44582)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44582

Test Plan:
test_quantize_fx.py

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23665974

fbshipit-source-id: 19ba6c61a9c77ef570b00614016506e9a2729f7c
2020-09-14 15:43:08 -07:00
Zafar
742654d1b6 [quant] ConvTranspose1d / ConvTranspose2d (#40371)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40371

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D22158981

Pulled By: z-a-f

fbshipit-source-id: defbf6fbe730a58d5b155dcb2460dd969797215c
2020-09-14 14:25:06 -07:00
Jerry Zhang
b6f0ea0c71 [quant][graphmode][fx][fix] Remove qconfig in convert (#44526)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44526

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23641960

fbshipit-source-id: 546da1c16694d1e1dfb72629085acaae2165e759
2020-09-11 15:51:47 -07:00
Jerry Zhang
a82ea6a91f [quant][graphmode][fx][fix] Support None qconfig in convert (#44524)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44524

A None qconfig was not handled previously.
closes: https://github.com/pytorch/pytorch/issues/44438

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23640269

fbshipit-source-id: 8bfa88c8c78d4530338d9d7fa9669876c386d91f
2020-09-11 15:22:25 -07:00
Vasiliy Kuznetsov
70dfeb44bd MinMax based observers: respect device affinity for state_dict (#44537)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44537

Originally, the `min_val`, `max_val`, `min_vals`, `max_vals`
attributes of observers were Tensors but not buffers.  They had custom
state_dict save/load code to ensure their state was saved.

At some point, these attributes became buffers, and the custom
save/load code remained. This introduced a subtle bug:
* create model A, move it to a device (cpu/cuda) and save its state_dict
* create model B, load its state dict.
* `min_val|min_vals|max_val|max_vals` would always be loaded to model A's device, even if the rest of model B was on a different device
* the above is inconsistent with how save/load on different devices is expected to work (see https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-across-devices)

In practice, the case people would sometimes hit is:
* model A is on CPU, state dict is saved
* model B is created and moved to GPU, state_dict from model A is loaded
* assertions throw when operations are attempted across different devices

This PR fixes the behavior by removing the custom save/load where
possible and letting the default `nn.Module` save/load code handle
device assignment.  We special case `PerChannelMinMaxObserver` and its
children to allow for loading buffers of different sizes, which is
normal.

There are some followups to also enable this for HistogramObserver
and FakeQuantize, which can be done in separate PRs due to higher
complexity.

Test Plan:
```
python test/test_quantization.py TestObserver.test_state_dict_respects_device_affinity
```

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23644493

fbshipit-source-id: 0dbb6aa309ad569a91a663b9ee7e44644080032e
2020-09-11 14:48:56 -07:00
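
A rough repro/check sketch of the behavior described above, assuming the torch.quantization.MinMaxObserver buffers (min_val / max_val):

```
import torch
from torch.quantization import MinMaxObserver

obs_a = MinMaxObserver()
obs_a(torch.randn(32))                 # populate min_val / max_val on CPU
state = obs_a.state_dict()

obs_b = MinMaxObserver()
if torch.cuda.is_available():
    obs_b = obs_b.to('cuda')
obs_b.load_state_dict(state)
# After this fix the loaded buffers follow obs_b's device instead of obs_a's.
print(obs_b.min_val.device, obs_b.max_val.device)
```
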
Jerry Zhang
11fb51d093 [quant][graphmode][fx][fix] Support dictionary output (#44508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44508

Bug fix for dictionary output

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23636182

fbshipit-source-id: 0c00cd6b9747fa3f8702d7f7a0d5edb31265f466
2020-09-11 11:29:20 -07:00
Jerry Zhang
0c58a017bd [quant][eagermode][refactor] Add set/get method for quantization and fusion mappings (#43990)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43990

Allow users to register custom quantization and fusion patterns

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23485344

fbshipit-source-id: 4f0174ee6d8000d83de0f73cb370e9a1941d54aa
2020-09-10 21:29:39 -07:00
Supriya Rao
646ffd4886 [quant] Move EmbeddingBag eager quantization to static (#44217)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44217

Move the tests to static ones as well

Test Plan:
python test/test_quantization.py TestStaticQuantizedModule.test_embedding_bag_api

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23547386

fbshipit-source-id: 41f81c31e1613098ecf6a7eff601c7dcd4b09c76
2020-09-08 19:05:02 -07:00
Supriya Rao
57b87aaf59 [quant] Add quantized Embedding module (#44208)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44208

Adds a quantized Embedding module in the static quantization namespace. Embedding
quantization requires only the weights to be quantized, so it is static.
Internally it calls the embedding_bag_byte op with the offsets set corresponding to the
indices.

A future PR will move EmbeddingBag quantization from dynamic to static as well.

Test Plan:
python test/test_quantization.py test_embedding_api

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23547384

fbshipit-source-id: eddc6fb144b4a771060e7bab5853656ccb4443f0
2020-09-08 19:04:59 -07:00
Jerry Zhang
6269b6e0f0 [quant][graphmode][fx][api] Call fuse in prepare (#43984)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43984

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23459261

fbshipit-source-id: 6b56b0916d76df67b9cc2f4be1fcee905d604019
2020-09-08 18:09:26 -07:00
Jerry Zhang
9f54bcc522 [quant][graphmode][fx] Support inplace option (#43983)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43983

Support the inplace option in the APIs

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23459260

fbshipit-source-id: 80409c7984f17d1a4e13fb1eece8e18a69ee43b3
2020-09-08 17:39:13 -07:00
Vasiliy Kuznetsov
00b5bd536f fx quant: add docblocks to _find_matches and _find_quants (#43928)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43928

Improving readability, no logic change.

Test Plan:
CI

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23440249

fbshipit-source-id: a7ebfc7ad15c73e26b9a94758e7254413cc17d29
2020-09-08 16:13:11 -07:00
Jerry Zhang
43e38d60d6 [quant][graphmode][fx] Support quantize per channel in all cases (#44042)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44042

Missed one case last time

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23479345

fbshipit-source-id: 30e6713120c494e9fab5584de4df9b25bec83d32
2020-09-08 15:45:14 -07:00
Vasiliy Kuznetsov
fd8e2064e0 quant: switch observers to use min_max (#42957)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42957

Switches observers to use the new min_max function to calculate
min and max at the same time.  We see around 45-50% speedup on
representative input shapes on the microbenchmarks for all observers except `HistogramObserver`.

Test Plan:
CI for correctness

performance:
```
cd benchmarks/operator_benchmark
// repeat (before diff, after diff) x (cpu, cuda)
python -m pt.qobserver_test --tag_filter all --device cpu
/*
    * before, cpu: https://our.intern.facebook.com/intern/paste/P138633280/
    * before, cuda: https://our.intern.facebook.com/intern/paste/P138639473/
    * after, cpu: https://our.intern.facebook.com/intern/paste/P138635458/
    * after, cuda: https://our.intern.facebook.com/intern/paste/P138636344/
*/
```

Imported from OSS

Reviewed By: supriyar

Differential Revision: D23093995

fbshipit-source-id: 9f416d144109b5b80baf089eb4bcfabe8fe358d5
2020-09-08 11:39:44 -07:00
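
The speedup comes from computing min and max in one pass over the data instead of two. A hedged sketch (current PyTorch exposes this as torch.aminmax; the commit used an internal equivalent at the time):

```
import torch

x = torch.randn(1 << 20)

# Two separate reductions (two passes over x):
min_val, max_val = x.min(), x.max()

# One fused reduction:
min_val, max_val = torch.aminmax(x)
```
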
Vasiliy Kuznetsov
618b4dd763 fx quant prepare: clarify naming (#44125)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44125

In `Quantizer._prepare`, `observed` was used for two different variables
with different types.  Making the names a bit cleaner and removing the
name conflict.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: dskhudia

Differential Revision: D23504109

fbshipit-source-id: 0f73eac3d6dd5f72ad5574a4d47d33808a70174a
2020-09-04 21:29:56 -07:00
Zachary DeVito
2ad5a82c43 [fx] get rid of graph_module.root (#44092)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44092

Instead, submodules and weights are installed directly on the
graph_module by transferring the original modules. This makes it more
likely that scripting will succeed (since we no longer have submodules
that are not used in the trace). It also prevents layered transforms
from having to special-case handling of the `root` module. GraphModules
can now be re-traced as part of the input to other transforms.

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D23504210

Pulled By: zdevito

fbshipit-source-id: f79e5c4cbfc52eb0ffb5d6ed89b37ce35a7dc467
2020-09-04 11:35:32 -07:00
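
A small sketch of the property this enables (assuming public torch.fx and TorchScript APIs; the toy module is illustrative): a GraphModule produced by one transform can be re-traced or scripted, since its submodules now live directly on it.

```
import torch
import torch.fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))

gm = torch.fx.symbolic_trace(M())       # first transform's output
gm2 = torch.fx.symbolic_trace(gm)       # re-trace the GraphModule itself
scripted = torch.jit.script(gm2)        # scripting simple cases should now work
```
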
Vinod Kumar S
2a1fc56694 replace the white list from default mappings (#41802)
Summary:
Replaced "whitelist" from default_mappings.py
Fixes https://github.com/pytorch/pytorch/issues/41756

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41802

Reviewed By: ngimel

Differential Revision: D23521452

Pulled By: malfet

fbshipit-source-id: 019a2d5c06dc59dc53d6c48b70fb35b216299cf4
2020-09-04 10:04:28 -07:00
Vasiliy Kuznetsov
71510c60ad fx qat: respect device affinity (#44115)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44115

Fixes device affinity in the FX prepare pass for QAT. Before this PR, observers
were always created on CPU. After this PR, observers are created on the
same device as the rest of the model. This will enable QAT prepare to
work regardless of whether users move the model to cuda before or after
calling this pass.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_qat_prepare_device_affinity
```

Imported from OSS

Reviewed By: supriyar

Differential Revision: D23502291

fbshipit-source-id: ec4ed20c21748a56a25e3395b35ab8640d71b5a8
2020-09-03 16:16:59 -07:00
Meghan Lele
7816d53798 [JIT] Add mypy type annotations for JIT (#43862)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43862

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D23491151

Pulled By: SplitInfinity

fbshipit-source-id: 88367b89896cf409bb9ac3db7490d6779efdc3a4
2020-09-03 15:09:24 -07:00
Vasiliy Kuznetsov
f9efcb646b fx quant: clarify state in Quantizer object (#43927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43927

Adds uninitialized placeholders for various state
used throughout the Quantizer object, with documentation
on what they are. No logic change.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23439473

fbshipit-source-id: d4ae83331cf20d81a7f974f88664ccddca063ffc
2020-09-02 16:34:00 -07:00
Vasiliy Kuznetsov
df8da5cb5a fx quant: make load_arg function more clear (#43923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43923

Readability improvements to `Quantizer.convert.load_arg`, makes
things easier to read.
1. add docblock
2. `arg` -> `arg_or_args`, to match what's actually happening
3. `loaded_arg` -> `loaded_args`, to match what's actually happening

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23438745

fbshipit-source-id: f886b324d2e2e33458b72381499e37dccfc3bd30
2020-09-02 09:06:05 -07:00
Vasiliy Kuznetsov
77ef77e5fa fx quant: rename matches -> is_match (#43914)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43914

Renames the `matches` function to `is_match`, since there is also
a list named `matches` that we are passing around in `Quantizer`,
and it would be good to decrease name conflicts.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23435601

fbshipit-source-id: 394af11e0120cfb07dedc79d5219247330d4dfd6
2020-09-02 09:06:01 -07:00
Vasiliy Kuznetsov
6f5282adc8 add quantization debug util to pretty print FX graphs (#43910)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43910

Adds a debug function to get a representation of all nodes in the
graph, such as

```
name          op      target         args               kwargs
x             plchdr  x              ()                 {}
linear_weight gt_prm  linear.weight  ()                 {}
add_1         cl_fun  <bi_fun add>   (x, linear_weight) {}
linear_1      cl_mod  linear         (add_1,)           {}
relu_1        cl_meth relu           (linear_1,)        {}
sum_1         cl_fun  <bi_meth sum>  (relu_1,)          {'dim': -1}
topk_1        cl_fun  <bi_meth topk> (sum_1, 3)         {}
```

using only the Python standard library. This is useful for printing the internal state of
graphs when working on FX code.

Has some on-by-default logic to shorten things so that node reprs for
toy models and unit tests fit into 80 chars.

I'm flexible on the function name and location; I care more that this is
accessible both from inside PyTorch and from debug scripts which
are not checked in.

Test Plan:
see
https://gist.github.com/vkuzo/ed0a50e5d6dc7442668b03bb417bd603 for
example usage

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23435029

fbshipit-source-id: 1a2df797156a19cedd705e9e700ba7098b5a1376
2020-09-02 09:04:44 -07:00
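
A rough standalone equivalent of the described helper (names are illustrative; current torch.fx also offers graph.print_tabular(), which needs the tabulate package):

```
import torch
import torch.fx

def print_nodes(gm: torch.fx.GraphModule) -> None:
    # One row per node: name, op, target, args, kwargs.
    for n in gm.graph.nodes:
        print(f"{n.name:<15} {n.op:<15} {str(n.target):<25} {n.args} {n.kwargs}")

gm = torch.fx.symbolic_trace(torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()))
print_nodes(gm)
```
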
Jerry Zhang
8fd9fe93be [quant][graphmode][fx] Support dynamic quantization without calibration (#43952)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43952

Run weight observer for dynamic quantization before inserting quant/dequant node

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D23452123

fbshipit-source-id: c322808fa8025bbadba36c2e5ab89f59e85de468
2020-09-01 19:09:48 -07:00
Jerry Zhang
0ffe3d84d5 [quant][graphmode][fx] Support dynamic quantization without calibration (#43892)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43892

Run the weight observer in the convert function, so users do not need to run calibration

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23429758

fbshipit-source-id: 5bc222e3b731789ff7a86463c449690a58dffb7b
2020-09-01 17:01:48 -07:00
Jerry Zhang
d15b9d980c [quant][graphmode][fx][refactor] Move patterns to separate files (#43891)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43891

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23429759

fbshipit-source-id: f19add96beb7c8bac323ad78f74588ca1393040c
2020-09-01 16:37:33 -07:00
Jerry Zhang
825c109eb7 [reland][quant][graphmode][fx] Add support for weight prepack folding (#43728) (#43902)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43902

Trace back from the weight node until we hit getattr, reconstruct the graph module with the traced nodes,
and run the graph module to pack the weight. Then replace the original chain of ops with the packed weight.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23432431

fbshipit-source-id: 657f21a8287494f7f87687a9d618ca46376d3aa3
2020-09-01 00:26:19 -07:00
Jerry Zhang
7db7da7151 [reland][quant][graphmode][fx] Add top level APIs (#43581) (#43901)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43901

Add similar APIs like eager and graph mode on torchscript
- fuse_fx
- quantize_fx (for both post training static and qat)
- quantize_dynamic_fx (for post training dynamic)
- prepare_fx (for both post training static and qat)
- prepare_dynamic_fx (for post training dynamic)
- convert_fx (for all modes)

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23432430

fbshipit-source-id: fc99eb75cbecd6ee7a3aa6c8ec71cd499ff7e3c1
2020-08-31 18:24:26 -07:00
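
A hedged usage sketch of the post-training static flow these APIs enable. Exact names and signatures have shifted across releases (later versions live under torch.ao.quantization and take an example_inputs argument); this reflects the shape of the API introduced here:

```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

float_model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).eval()
qconfig_dict = {"": get_default_qconfig("fbgemm")}

prepared = prepare_fx(float_model, qconfig_dict)   # fuse + insert observers
prepared(torch.randn(8, 4))                        # calibrate
quantized = convert_fx(prepared)                   # swap to quantized ops
```
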
Qi Zhou
f73ba88946 Avoid resizing in MinMaxObserver (#43789)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43789

Since it's a single element, there is no need to resize; in some cases we may not be able to resize the
buffers.

Test Plan: unit tests

Reviewed By: supriyar

Differential Revision: D23393108

fbshipit-source-id: 46cd7f73ed42a05093662213978a01ee726433eb
2020-08-31 17:41:39 -07:00
Alban Desmaison
f7bae5b6b1 Revert D23385091: [quant][graphmode][fx] Add top level APIs
Test Plan: revert-hammer

Differential Revision:
D23385091 (eb4199b0a7)

Original commit changeset: b789e54e1a0f

fbshipit-source-id: dc3dd9169d34beab92488d78d42d7e7d05e771d1
2020-08-31 12:18:29 -07:00
Alban Desmaison
68304c527a Revert D23385090: [quant][graphmode][fx] Add support for weight prepack folding
Test Plan: revert-hammer

Differential Revision:
D23385090 (ef08f92076)

Original commit changeset: 11341f0af525

fbshipit-source-id: fe2bcdc16106923a2cee99eb5cc0a1e9c14ad2c5
2020-08-31 12:17:28 -07:00
Jerry Zhang
ef08f92076 [quant][graphmode][fx] Add support for weight prepack folding (#43728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43728

Trace back from the weight node until we hit getattr, reconstruct the graph module with the traced nodes,
and run the graph module to pack the weight. Then replace the original chain of ops with the packed weight.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23385090

fbshipit-source-id: 11341f0af525a02ecec36f163a9cd35dee3744a1
2020-08-31 10:35:11 -07:00
Jerry Zhang
eb4199b0a7 [quant][graphmode][fx] Add top level APIs (#43581)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43581

Add similar APIs like eager and graph mode on torchscript
- fuse_fx
- quantize_fx (for both post training static and qat)
- quantize_dynamic_fx (for post training dynamic)
- prepare_fx (for both post training static and qat)
- prepare_dynamic_fx (for post training dynamic)
- convert_fx (for all modes)

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23385091

fbshipit-source-id: b789e54e1a0f3af6b026fd568281984e253e0433
2020-08-31 10:12:55 -07:00
Jerry Zhang
b8d34547ee [quant][graphmode][fx][fix] enable per channel quantization for functional ops (#43534)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43534

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23310857

fbshipit-source-id: ff7a681ee55bcc51f564e9de78319249b989366c
2020-08-31 09:35:25 -07:00
Yizhou Yu
8997a4b56b [typing] Enable typing in torch.quantization.fuse_modules typechecks … (#43786)
Summary:
Enable typing in torch.quantization.fuse_modules typechecks during CI.

Fixes #42971

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43786

Reviewed By: malfet

Differential Revision: D23403258

Pulled By: yizhouyu

fbshipit-source-id: 4cd24a4fcf1408341a210fa50f574887b6db5e0e
2020-08-28 20:42:23 -07:00
Dmytro Dzhulgakov
633d239409 [torch.fx] Pass placeholders through delegate too (#43432)
Summary:
It's useful if we add additional attributes to nodes in the graph - it's easier to set the attribute on all nodes, even if the value happens to be None.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43432

Reviewed By: jamesr66a

Differential Revision: D23276433

Pulled By: dzhulgakov

fbshipit-source-id: c69e7cb723bbbb4dba3b508a3d6c0e456fe610df
2020-08-28 18:07:52 -07:00
maxosen64
1f7434d1ea Fix 'module' to 'model' in quantize_dynamic doc (#43693)
Summary:
Fixes issue https://github.com/pytorch/pytorch/issues/43503

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43693

Reviewed By: malfet

Differential Revision: D23397641

Pulled By: mrshenli

fbshipit-source-id: bc216cea4f0a30c035e84a6cfebabd3755ef1305
2020-08-28 10:44:43 -07:00
Jerry Zhang
5a1aa0e21e [reland][quant][graphmode][fx] Add e2e test on torchvision (#43587)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43587

Add tests for graph mode quantization on torchvision and make sure it matches
current eager mode quantization

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23331253

fbshipit-source-id: 0445a44145d99837a2c975684cd0a0b7d965c8f9
2020-08-27 10:12:07 -07:00
Ralf Gommers
573940f8d7 Fix type annotation errors in torch.functional (#43446)
Summary:
Closes gh-42968

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43446

Reviewed By: albanD

Differential Revision: D23280962

Pulled By: malfet

fbshipit-source-id: de5386a95a20ecc814c39cbec3e4252112340b3a
2020-08-26 08:27:59 -07:00
Mikhail Zolotukhin
be637fd5f6 Revert D23306683: [quant][graphmode][fx] Testing torchvision
Test Plan: revert-hammer

Differential Revision:
D23306683 (62dcd253e3)

Original commit changeset: 30d27e225d45

fbshipit-source-id: e661334d187d3d6756facd36f2ebdb3ab2cd2e26
2020-08-25 15:24:02 -07:00
Jerry Zhang
62dcd253e3 [quant][graphmode][fx] Testing torchvision (#43526)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43526

Add tests for graph mode quantization on torchvision and make sure it matches
current eager mode quantization

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23306683

fbshipit-source-id: 30d27e225d4557bfc1d9aa462086e416aa9a9c0e
2020-08-25 13:02:14 -07:00
Supriya Rao
284ff04792 [quant] Support set API for EmbeddingBag quantization (#43433)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43433

Add support for torch.quint8 dtype

Test Plan: Imported from OSS

Reviewed By: radkris-git

Differential Revision: D23277002

fbshipit-source-id: 4204bc62f124b4fd481aaa6aa47b9437978c43ee
2020-08-24 14:33:35 -07:00
Jerry Zhang
ec9e6e07bc [quant][graphmode][fx] Add support for general value ops (#43439)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43439

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23278585

fbshipit-source-id: ad29f39482cf4909068ce29555470ef430ea17f6
2020-08-22 08:52:28 -07:00
Jerry Zhang
88b564ce39 [quant][graphmode][fx] Add support for general shape ops (#43438)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43438

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23278583

fbshipit-source-id: 34b73390d47c7ce60528444da77c4096432ea2cb
2020-08-21 23:07:20 -07:00
Jerry Zhang
192c4b0050 [quant][graphmode][fx] Add support for clamp (#43437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43437

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23278584

fbshipit-source-id: 266dc68c9ca30d9160a1dacf28dc7781b3d472c2
2020-08-21 20:21:50 -07:00
Jerry Zhang
490d41aaa6 [quant][graphmode][fx] Add support for instance_norm (#43377)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43377

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23257045

fbshipit-source-id: 7f4ad5d81f21bf0b8b9d960b054b20dc889e6c3b
2020-08-21 18:32:50 -07:00
Jerry Zhang
aec917a408 [quant][graphmode][fx] Add support for layer_norm (#43376)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43376

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23257048

fbshipit-source-id: 47a04a5221bcaf930d574f879d515e3dff2d1f6d
2020-08-21 16:38:16 -07:00
Jerry Zhang
089bb1a8e4 [quant][graphmode][fx] Add support for elu (#43375)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43375

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23257043

fbshipit-source-id: 22360610d87ef98d25871daff3fdc3dbb3ec5bdb
2020-08-21 16:07:36 -07:00
Jerry Zhang
5a02c6b158 [quant][graphmode][fx] Add support for hardswish (#43374)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43374

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23257044

fbshipit-source-id: 2cdf12e104db6e51ffa0324eb602e68132a646ef
2020-08-21 16:06:32 -07:00
Jerry Zhang
109ea59afc [quant][graphmode][fx] Add support for batchnorm relu (#43335)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43335

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23243563

fbshipit-source-id: 3c562f519b90e0157761a00c89eca63af8b909f2
2020-08-21 14:32:51 -07:00
Jerry Zhang
9e87a8ddf4 [quant][graphmode][fx] Add support for batchnorm (#43334)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43334

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23243560

fbshipit-source-id: 0a7bc331293bbc3db85616bf43a995d3b112beb6
2020-08-21 14:31:49 -07:00
Jerry Zhang
650590da0d [quant][graphmode][fx] Add support for conv module + relu (#43287)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43287

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23221735

fbshipit-source-id: 2513892a1928f92c09d7e9a24b2ea12b00de218d
2020-08-21 12:13:02 -07:00
Supriya Rao
3293fdfa80 [quant] Enable from_float for quantized Embedding_Bag (#43176)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43176

Convert floating point nn.EmbeddingBag module to
nn.quantized.dynamic.EmbeddingBag module

Test Plan:
python test/test_quantization.py TestDynamicQuantizedModule.test_embedding_bag_api
python test/test_quantization.py TestPostTrainingDynamic.test_embedding_quantization

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23200196

fbshipit-source-id: 090f47dbf7aceab9c719cbf282fad20fe3e5a983
2020-08-21 11:46:03 -07:00
Edmund Williams Jr
17f9edda42 Bias Correction Implementation (#41845)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41845

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D22661503

Pulled By: edmundw314

fbshipit-source-id: a88c349c6cc15b1c66aa6dee7593ef3df588eb85
2020-08-20 21:40:33 -07:00
Jerry Zhang
217ddea93a [quant] Make OP_LIST_TO_FUSER_METHOD public (#43286)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43286

We need to use this in graph mode quantization on fx

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23221734

fbshipit-source-id: 7c3c3840ce5bdc185b962e081aff1618f4c58e85
2020-08-20 20:19:13 -07:00
Jerry Zhang
9984d33542 [quant][graphmode][fx] Add support for conv module (#43285)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43285

Porting op tests from test_quantize_jit.py

Test Plan:
TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23221733

fbshipit-source-id: c1f0f7ae0c82379143aa33fc1af7284d8303174b
2020-08-20 19:53:30 -07:00
Jerry Zhang
b0ec336477 [quant][graphmode][fx][test] Add per op test for graph mode quant on fx (#43229)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43229

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D23201692

fbshipit-source-id: 37fa54dcf0a9d5029f1101e11bfd4ca45b422641
2020-08-20 17:32:02 -07:00
Jerry Zhang
dae2973fae [quant][graphmode][fx] Add graph mode quantization on fx (#43175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43175

This PR adds graph mode quantization on FX: https://github.com/pytorch/pytorch/pull/42741
Currently it matches eager mode quantization for torchvision with static/dynamic/QAT.
The DDP/SyncBN tests are still WIP.

Test Plan:
python test/test_quantization.py TestQuantizeFx

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23178602

fbshipit-source-id: 8e7e0322846fbda2cfa79ad188abd7235326f879
2020-08-20 14:50:09 -07:00
Vasiliy Kuznetsov
57af1ec145 observers: use torch.all to check for valid min and max values (#43151)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43151

Using `torch.all` instead of `torch.sum` and a length check.
It's unclear whether the increase in perf (~5% for small inputs) is
real, but it should be a net benefit, especially for larger channel inputs.

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23170426

fbshipit-source-id: ee5c25eb93cee1430661128ac9458a9c525df8e5
2020-08-17 17:08:57 -07:00
Vasiliy Kuznetsov
3264ba065c observers: use clamp instead of min/max in calculate_qparams (#43150)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43150

The current logic was expensive because it created tensors on CUDA.
Switching to clamp since it can work without needing to create tensors.

Test Plan:
benchmarks

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23170427

fbshipit-source-id: 6fe3a728e737aca9f6c2c4d518c6376738577e21
2020-08-17 17:08:54 -07:00
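
An illustrative before/after of the kind of change described (not the exact observer code): clamping against Python scalars avoids allocating temporary tensors on the observer's device.

```
import torch

zero_point = torch.tensor([300.0])
quant_min, quant_max = 0, 255

# Before: nested min/max against freshly created tensors.
zp_old = torch.max(torch.tensor([float(quant_min)]),
                   torch.min(torch.tensor([float(quant_max)]), zero_point))

# After: clamp takes scalar bounds directly.
zp_new = torch.clamp(zero_point, quant_min, quant_max)
assert torch.equal(zp_old, zp_new)
```
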
Vasiliy Kuznetsov
a5dfba0a6e observers: make eps a buffer (#43149)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43149

This value doesn't change, so make it a buffer to only pay
the cost of creating a tensor once.

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23170428

fbshipit-source-id: 6b963951a573efcc5b5a57649c814590b448dd72
2020-08-17 17:08:51 -07:00
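
A minimal sketch of the pattern (TinyObserver is a made-up illustration, not the real observer class): registering eps as a buffer creates the tensor once, after which it moves with .to(device) and appears in state_dict.

```
import torch

class TinyObserver(torch.nn.Module):
    def __init__(self, eps: float = torch.finfo(torch.float32).eps):
        super().__init__()
        # Created once here, instead of on every calculate_qparams() call.
        self.register_buffer('eps', torch.tensor([eps]))

    def scale_from_range(self, min_val: torch.Tensor, max_val: torch.Tensor,
                         quant_max: int = 255) -> torch.Tensor:
        scale = (max_val - min_val) / float(quant_max)
        return torch.max(scale, self.eps)   # guard against zero/negative scale
```
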
Paul Shao
b992a927a9 Clearer Semantics and Naming for Customized Quantization Range Initialization in Observer (#42602)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42602

In this diff, clearer semantics and naming are introduced by splitting the original `init_dynamic_qrange` into 2 separate `Optional[int]` parameters, `qmin` and `qmax`, to avoid confusing these parameters with dynamic quantization.

The `qmin` and `qmax` parameters allow users to specify their own custom quantization range and enable specific use cases for lower-bit quantization.

Test Plan:
To assert the correctness and compatibility of the changes with existing observers, on a devvm, execute the following command to run the unit tests:

`buck test //caffe2/test:quantization -- observer`

Reviewed By: vkuzo, raghuramank100

Differential Revision: D22948334

fbshipit-source-id: 275bc8c9b5db4ba76fc2e79ed938376ea4f5a37c
2020-08-13 21:15:23 -07:00
Jerry Zhang
a55b7e2a6d [reland][quant][fix] Remove activation_post_process in qat modules (#42343) (#43015)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43015

Currently, activation_post_process modules are inserted by default in QAT modules, which is not
friendly to automatic quantization tools; this PR removes them.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23105059

fbshipit-source-id: 3439ac39e718ffb0390468163bcbffd384802b57
2020-08-13 20:44:14 -07:00
Jerry Zhang
85752b989d [quant][doc] Print more info for fake quantize module (#43031)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43031

fixes: https://github.com/pytorch/pytorch/issues/43023

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23116200

fbshipit-source-id: faa90ce8711da0785d635aacd0362c45717cfacc
2020-08-13 20:27:36 -07:00
Supriya Rao
816d37b1d8 [quant] Make PerChannel Observer work with float qparams (#42690)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42690

Add implementation for new qscheme per_channel_affine_float_qparams in observer

Test Plan:
python test/test_quantization.py TestObserver.test_per_channel_observers

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23070633

fbshipit-source-id: 84d348b0ad91e9214770131a72f7adfd3970349c
2020-08-13 11:22:19 -07:00
Richard Zou
607e49cc83 Revert D22856816: [quant][fix] Remove activation_post_process in qat modules
Test Plan: revert-hammer

Differential Revision:
D22856816 (8cb42fce17)

Original commit changeset: 988a43bce46a

fbshipit-source-id: eff5b9abdfc15b21c02c61eefbda38d349173436
2020-08-13 07:22:20 -07:00
Jerry Zhang
8cb42fce17 [quant][fix] Remove activation_post_process in qat modules (#42343)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42343

Currently, activation_post_process modules are inserted by default in QAT modules, which is not
friendly to automatic quantization tools; this PR removes them.

Test Plan: Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D22856816

fbshipit-source-id: 988a43bce46a992b38fd0d469929f89e5b046131
2020-08-12 20:14:23 -07:00
Jerry Zhang
ac93d45906 [quant] Attach qconfig to all modules (#42576)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42576

Previously we had a qconfig propagation list and we only attached qconfig to modules
in the list. This works when everything is quantized in the form of modules,
but now that we are expanding quantization to functional/torch ops, we need to attach qconfig
to all modules.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D22939453

fbshipit-source-id: 7d6a1f73ff9bfe461b3afc75aa266fcc8f7db517
2020-08-11 20:34:34 -07:00
Mike Ruberry
b7a9bc0802 Revert D22217029: Add fake quantize operator that works in backward pass
Test Plan: revert-hammer

Differential Revision:
D22217029 (48e978ba18)

Original commit changeset: 7055a2cdafcf

fbshipit-source-id: f57a27be412c6fbfd5a5b07a26f758ac36be3b67
2020-08-07 23:04:40 -07:00
Presley Graham
48e978ba18 Add fake quantize operator that works in backward pass (#40532)
Summary:
This diff adds FakeQuantizeWithBackward. This works the same way as the regular FakeQuantize module, allowing QAT to occur in the forward pass, except it has an additional quantize_backward parameter. When quantize_backward is enabled, the gradients are fake quantized as well (dynamically, using hard-coded values). This allows the user to see whether there would be a significant loss of accuracy if the gradients were quantized in their model.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40532

Test Plan: The relevant test for this can be run using `python test/test_quantization.py TestQATBackward.test_forward_and_backward`

Reviewed By: supriyar

Differential Revision: D22217029

Pulled By: durumu

fbshipit-source-id: 7055a2cdafcf022f1ea11c3442721ae146d2b3f2
2020-08-07 17:47:01 -07:00
Presley Graham
7332c21f7a Speed up HistogramObserver by vectorizing critical path (#41041)
Summary:
22x speedup over the code this replaces. Tested on ResNet18 on a devvm using CPU only, using default parameters for HistogramObserver (i.e. 2048 bins).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41041

Test Plan:
To run the test against the reference (old) implementation, you can use `python test/test_quantization.py TestRecordHistogramObserver.test_histogram_observer_against_reference`.

To run the benchmark, while in the folder `benchmarks/operator_benchmark`, you can use `python -m benchmark_all_quantized_test --operators HistogramObserverCalculateQparams`.

Benchmark results before speedup:
```
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: HistogramObserverCalculateQparams
# Mode: Eager
# Name: HistogramObserverCalculateQparams_C3_M512_N512_dtypetorch.quint8_cpu_qschemetorch.per_tensor_affine
# Input: C: 3, M: 512, N: 512, dtype: torch.quint8, device: cpu, qscheme: torch.per_tensor_affine
Forward Execution Time (us) : 185818.566

# Benchmarking PyTorch: HistogramObserverCalculateQparams
# Mode: Eager
# Name: HistogramObserverCalculateQparams_C3_M512_N512_dtypetorch.quint8_cpu_qschemetorch.per_tensor_symmetric
# Input: C: 3, M: 512, N: 512, dtype: torch.quint8, device: cpu, qscheme: torch.per_tensor_symmetric
Forward Execution Time (us) : 165325.916
```

Benchmark results after speedup:
```
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: HistogramObserverCalculateQparams
# Mode: Eager
# Name: HistogramObserverCalculateQparams_C3_M512_N512_dtypetorch.quint8_cpu_qschemetorch.per_tensor_affine
# Input: C: 3, M: 512, N: 512, dtype: torch.quint8, device: cpu, qscheme: torch.per_tensor_affine
Forward Execution Time (us) : 12242.241

# Benchmarking PyTorch: HistogramObserverCalculateQparams
# Mode: Eager
# Name: HistogramObserverCalculateQparams_C3_M512_N512_dtypetorch.quint8_cpu_qschemetorch.per_tensor_symmetric
# Input: C: 3, M: 512, N: 512, dtype: torch.quint8, device: cpu, qscheme: torch.per_tensor_symmetric
Forward Execution Time (us) : 12655.354
```

Reviewed By: raghuramank100

Differential Revision: D22400755

Pulled By: durumu

fbshipit-source-id: 639ac796a554710a33c8a930c1feae95a1148718
2020-08-07 12:29:23 -07:00
Jerry Zhang
c3236b6649 [quant] Expose register activation post process hook function to user (#42342)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42342

Test Plan: Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D22856711

fbshipit-source-id: d6ad080c82b744ae1147a656c321c448ac5e7f10
2020-08-03 12:28:42 -07:00
Supriya Rao
38bf5be24f [quant] Use PlaceholderObserver instead of Fp16Observer and NoopObserver (#42348)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42348

Use the dtype info in PlaceholderObserver to decide which ops to insert in the graph.
In the next PR we can delete NoopObserver.

Test Plan:
python test/test_quantization.py

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D22859457

fbshipit-source-id: a5c618f22315534ebd9a2df77b14a0aece196989
2020-07-31 12:33:56 -07:00
Supriya Rao
6bd46b583e [quant][graph] Add support for FP16 dynamic quant (#42222)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42222

This change adds the necessary passes to perform FP16 dynamic quantization.
We skip inserting observers for activations based on the dtype (torch.float16) and only insert the Fp16Observer for weights

Test Plan:
python test/test_quantization.py TestQuantizeJitOps

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D22849220

fbshipit-source-id: 2c53594ecd2485e9e3dd0b380eceaf7c5ab5fc50
2020-07-31 12:33:53 -07:00
Supriya Rao
8c5bf10264 [quant] Add FP16Observer for fp16 quant support (#42221)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42221

Adds a new observer that emits a warning if the range of the tensor is beyond the fp16 range. This will be used further in graph mode quantization to insert the cast-to-fp16 ops in the graph

Test Plan:
python test/test_quantization.py TestObserver.test_fp16_observer

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D22849222

fbshipit-source-id: a301281ce38ba4d4e7a009308400d34a08c113d2
2020-07-31 12:33:51 -07:00
Paul Shao
382781221d Extending Learnable Fake Quantize module to support gradient scaling and factory (partial) construction (#41969)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41969

In this diff, the `_LearnableFakeQuantize` module is extended to provide support for gradient scaling, where the gradients for both scale and zero point are multiplied by a constant `g` (which in some cases can help with quicker convergence). In addition, it is also augmented to provide a factory method via `_with_args` such that a partial constructor of the module can be built.

Test Plan:
For correctness of the fake quantizer operators, on a devvm, enter the following command:
```
buck test //caffe2/torch:quantization -- learnable_py_module
```

Reviewed By: z-a-f

Differential Revision: D22715629

fbshipit-source-id: ff8e5764f81ca7264bf9333789f57e0b0cec7a72
2020-07-29 10:22:26 -07:00
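
The factory idea mirrors the public FakeQuantize.with_args pattern used to build QConfigs; a sketch with the standard modules (the learnable variant itself is internal, so the public classes stand in here):

```
import torch
from torch.quantization import FakeQuantize, MovingAverageMinMaxObserver, QConfig

act_fq = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    quant_min=0, quant_max=255,
    dtype=torch.quint8, qscheme=torch.per_tensor_affine,
)
weight_fq = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    quant_min=-128, quant_max=127,
    dtype=torch.qint8, qscheme=torch.per_tensor_symmetric,
)
my_qat_qconfig = QConfig(activation=act_fq, weight=weight_fq)
```
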
Paul Shao
5a6d88d503 Updates to Scale and Zero Point Gradient Calculation (#42034)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42034

In this diff, scale and zero point gradient calculations are updated to correctly reflect the actual backpropagation equation (instead of `dScale * dX`, the near-final output should be `dScale * dY`; the same applies to zero point).

Test Plan:
To execute the unit tests for all affected learnable fake quantize modules and kernels, on a devvm, execute the following command:

`buck test //caffe2/test:quantization -- learnable`

To enable the `cuda` tests, execute the following command:

`buck test mode/dev-nosan //caffe2/test:quantization -- learnable`

Reviewed By: jerryzh168

Differential Revision: D22735668

fbshipit-source-id: 45c1e0fd38cbb2d8d5e60be4711e1e989e9743b4
2020-07-27 11:18:49 -07:00
Paul Shao
c261a894d1 Updates to Python Module for Calculation of dX and Addition of Unit Tests (#42033)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42033

In this diff, the Python `_LearnableFakeQuantize` module is updated where the gradient with respect to the input `x` is actually computed instead of passed through. Argument naming is also updated for better clarity; and unit tests on the `PerTensor` and `PerChannel` operators are added for asserting correctness.

Test Plan:
On a devvm, execute the command:

`buck test //caffe2/test:quantization -- learnable_py_module`

To include `cuda` tests as well, run:

`buck test mode/dev-nosan //caffe2/test:quantization -- learnable_py_module`

Reviewed By: jerryzh168

Differential Revision: D22735580

fbshipit-source-id: 66bea7e9f8cb6422936e653500f917aa597c86de
2020-07-27 11:18:47 -07:00
Haixin Liu
c5b4f60fc2 Move qconfig removal into convert() (#41930)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41930

As title
ghstack-source-id: 108517079

Test Plan: CI

Reviewed By: jerryzh168

Differential Revision: D22698386

fbshipit-source-id: 4f748c9bae4a0b615aa69c7cc8d8e451e5d26863
2020-07-25 13:27:13 -07:00
Edmund Williams Jr
e9e6cc8c83 Added Prehook option to prepare method (#41863)
Summary:
Added logic so that if a prehook is passed into the prepare method during quantization, then the hook will be added as a pre-forward hook to all leaf modules (and modules specified in the non_leaf_module_list).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41863

Test Plan:
Small demo: made a simple module, then called prepare with the prehook parameter set to the Numeric Suite logger, and printed the results to verify it's what we wanted
{F245156246}

Reviewed By: jerryzh168

Differential Revision: D22671288

Pulled By: edmundw314

fbshipit-source-id: ce65a00830ff03360a82c0a075b3b6d8cbc4362e
2020-07-24 10:26:39 -07:00
Supriya Rao
36fb14b68b [quant] Add Graph Mode Passes to quantize EmbeddingBag operators (#41612)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41612

This change adds preliminary support to quantize the EmbeddingBag operators. We currently support 4-bit and 8-bit quantization+packing of the weights.

To quantize these operators, specify the operator name in the `custom_op_name` field of the NoopObserver. Based on the op name (4bit or 8bit) we call the corresponding quantization functions.
Refer to the testplan for how to invoke the qconfig for the embedding_bag ops.

Future versions of this will support 4-bit and 2-bit qtensors with native support to observe and quantize it.

NB - This version assumes that the weights in the EmbeddingBag Module reside on the same device.

Test Plan:
python test/test_quantization.py TestQuantizeDynamicJitOps.test_embedding_bag

Imported from OSS

Reviewed By: vkuzo, jerryzh168

Differential Revision: D22609342

fbshipit-source-id: 23e33f44a451c26719e6e283e87fbf09b584c0e6
2020-07-23 18:54:59 -07:00
Edmund Williams Jr
fd62847eb2 cross_layer_equalization (#41685)
Summary:
The goal is to implement cross-layer equalization as described in section 4.1 of this paper: https://arxiv.org/pdf/1906.04721.pdf
Given two adjacent submodules A and B in a trained model, quantization might hurt one of the submodules more than the other. The paper poses the idea that a loss in accuracy from quantizing can be due to a difference in the channel ranges between the two submodules (the output channel range of A can be small, while the input channel range of B can be large). To minimize this source of error, we want to scale the tensors of A and B so that their channel ranges are equal.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41685

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D22630219

Pulled By: edmundw314

fbshipit-source-id: ccc91ba12c10b652d7275222da8b85455b8a7cd5
2020-07-22 08:39:23 -07:00
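
A rough sketch of the rescaling from section 4.1 for two adjacent Linear layers (my reading of the paper, not the code added by this PR; uses max-abs as the per-channel "range"):

```
import torch

def cross_layer_equalize(lin1: torch.nn.Linear, lin2: torch.nn.Linear,
                         eps: float = 1e-9) -> None:
    w1, w2 = lin1.weight.data, lin2.weight.data       # shapes (c, in) and (out, c)
    r1 = w1.abs().amax(dim=1)                         # range of lin1's output channels
    r2 = w2.abs().amax(dim=0)                         # range of lin2's input channels
    s = (torch.sqrt(r1 * r2) / (r2 + eps)).clamp(min=eps)   # s_i = sqrt(r1_i / r2_i)
    lin1.weight.data = w1 / s.unsqueeze(1)            # W1 <- S^-1 W1
    if lin1.bias is not None:
        lin1.bias.data = lin1.bias.data / s           # b1 <- S^-1 b1
    lin2.weight.data = w2 * s.unsqueeze(0)            # W2 <- W2 S
```
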
Paul Shao
9e0c746b15 Augmenting Concrete Observer Constructors to Support Dynamic Quantization Range; Modifying Utility Functions in _LearnableFakeQuantize Module for Better Logging and Baseline Construction. (#41815)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41815

**All are minor changes to enable better simulations.**

The constructors of MinMaxObserver, MovingAverageMinMaxObserver, PerChannelMinMaxObserver, and MovingAveragePerChannelMinMaxObserver are augmented so they can utilize the dynamic quantization range support in the _ObserverBase class.

In addition, minor adjustments are made to the enable_static_observation function that allow the observer to update parameters but not fake quantize the output (for constructing a baseline).

Test Plan:
To ensure this modification is still backward compatible with past usages, numerics are verified by running the quantization unit test suite, which contains various observer tests. The following command executes the test suite, which also verifies the observer numerics:
```
buck test //caffe2/test:quantization -- observer
```

Reviewed By: z-a-f

Differential Revision: D22649128

fbshipit-source-id: 32393b706f9b69579dc2f644fb4859924d1f3773
2020-07-21 17:59:40 -07:00
Paul Shao
5c50cb567c Generalized Learnable Fake Quantizer Module (#41535)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41535

A generalized fake quantization module is built to support lower-bit fake quantization with back propagation on the scale and zero point. The module supports both per tensor and per channel fake quantization.

Test Plan:
Please see diff D22337313 for a related experiment performed on the fake quantizer module.

The `_LearnableFakeQuantize` module supports the following use cases:
- Per Tensor Fake Quantization or Per Channel Fake Quantization
- Static Estimation from Observers or Quantization Parameter Learning through Back Propagation

By default, the module assumes per tensor affine fake quantization. To switch to per channel, during initialization, declare `channel_size` with the appropriate length. To toggle between utilizing static estimation and parameter learning with back propagation, you can invoke the call `enable_param_learning` or `enable_static_estimate`. For more information on the flags that support these operations, please see the doc string of the `_LearnableFakeQuantize` module.

The `_LearnableFakeQuantizer` module relies on 2 operators for its forward and backward paths: `_LearnableFakeQuantizePerTensorOp` and `_LearnableFakeQuantizePerChannelOp`. The backpropagation routine is developed based on the following literature:
- Learned Step Size Quantization: https://openreview.net/pdf?id=rkgO66VKDS
- Trained Quantization Thresholds: https://arxiv.org/pdf/1903.08066.pdf

Reviewed By: z-a-f

Differential Revision: D22573645

fbshipit-source-id: cfd9ece8a959ae31c00d9beb1acf9dfed71a7ea1
2020-07-20 18:24:21 -07:00
Paul Shao
16dde6e3a0 Augmenting Observers to Support Dynamic Quantization Range (#41113)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41113

In this diff, the `ObserverBase` class is augmented with 2 additional optional arguments qmin and qmax. Correspondingly the calculation of qmin and qmax and the related quantization parameters are modified to accommodate this additional flexibility should the number of bits for quantization be lower than 8 (the default value).

Additional logic in the base class `_calculate_qparams` function has also been modified to provide support for dynamic quantization range.

Test Plan:
To ensure this modification is still backward compatible with past usages, numerics are verified by running the quantization unit test suite, which contains various observer tests. The following command executes the test suite, which also verifies the observer numerics:

`buck test //caffe2/test:quantization -- observer`

This modified observer script can be tested within the experiments for lower bit fake quantization. Please see the following diffs for reference.
- Single Fake Quantizer: D22337447
- Single Conv Layer: D22338532

Reviewed By: z-a-f

Differential Revision: D22427134

fbshipit-source-id: f405e633289322078b0f4a417f54b684adff2549
2020-07-20 08:51:31 -07:00
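
For reference, the standard asymmetric qparams computation this generalizes, with a custom [qmin, qmax] (a plain-Python sketch, not the observer's exact code):

```
import torch

def affine_qparams(min_val: float, max_val: float, qmin: int, qmax: int):
    # The representable range must include zero.
    min_val, max_val = min(min_val, 0.0), max(max_val, 0.0)
    scale = (max_val - min_val) / float(qmax - qmin)
    scale = max(scale, torch.finfo(torch.float32).eps)
    zero_point = int(max(qmin, min(qmax, qmin - round(min_val / scale))))
    return scale, zero_point

print(affine_qparams(-1.0, 1.0, qmin=0, qmax=15))   # 4-bit example
```
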
wudenggang
9600ed9af3 typo fixes (#41632)
Summary:
typo fixes

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41632

Reviewed By: ezyang

Differential Revision: D22617827

Pulled By: mrshenli

fbshipit-source-id: c2bfcb7cc36913a8dd32f13fc9adc3aa0a9b682f
2020-07-20 07:23:00 -07:00
emil
0c77bd7c0b Quantization: preserving pre and post forward hooks (#37233)
Summary:
1. During convert(), preserve the module's **pre and post forward** hooks
2. During fusion, preserve only the module's **pre forward** hooks (because after fusion the output is no longer the same)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/37233

Differential Revision: D22425141

Pulled By: jerryzh168

fbshipit-source-id: e69b81821d507dcd110d2ff3594ba94b9593c8da
2020-07-13 12:41:24 -07:00
Edward Leardi
733b8c23c4 Fix several quantization documentation typos (#40567)
Summary:
This PR fixes several typos I noticed in the docs here: https://pytorch.org/docs/master/quantization.html. In one case there was a misspelled module [torch.nn.instrinsic.qat](https://pytorch.org/docs/master/quantization.html#torch-nn-instrinsic-qat) which I corrected and am including screenshots of below just in case.

<img width="1094" alt="before" src="https://user-images.githubusercontent.com/54918401/85766765-5cdd6280-b6e5-11ea-93e6-4944cf820b71.png">

<img width="1093" alt="after" src="https://user-images.githubusercontent.com/54918401/85766769-5d75f900-b6e5-11ea-8850-0d1f5ed67b16.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40567

Differential Revision: D22311291

Pulled By: ezyang

fbshipit-source-id: 65d1f3dd043357e38a584d9e30f31634a5b0995c
2020-07-07 09:45:23 -07:00
Haixin Liu
12b5bdc601 Remove unused Logger in get_matching_activations (#41023)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41023

Remove Logger in get_matching_activations since it's not used.
ghstack-source-id: 107237046

Test Plan:
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_linear_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_submodule_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_functional_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_linear_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_functional_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_linear_dynamic'

Differential Revision: D22394957

fbshipit-source-id: 7d59e0f35e9f4c304b8487460d48236ee6e5a872
2020-07-07 00:33:07 -07:00
Haixin Liu
cab7d94d47 [PyTorch Numeric Suite] Remove unnecessary Logger in input arguments (#40890)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40890

Remove unnecessary Logger in input arguments and simplify the API.
ghstack-source-id: 107110487

Test Plan:
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_linear_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_submodule_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_functional_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_linear_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_functional_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_linear_dynamic'

Differential Revision: D22345477

fbshipit-source-id: d8b4eb3d6cb3049aa3296dead8ba29bf5467bd1c
2020-07-03 02:45:46 -07:00
Supriya Rao
6aebd2c412 [quant][graphmode] Add FP16 quant support - Insert Noop Observers (#40708)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40708

Insert NoopObservers for activations and weight tensors for FP16

Test Plan:
python test/test_quantization.py test_prepare_dynamic

Imported from OSS

Differential Revision: D22335976

fbshipit-source-id: b19e8035c7db3b0b065ec09c9ad6d913eb434f3e
2020-07-01 14:13:31 -07:00
Haixin Liu
46b9e519aa Remove print (#40475)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40475

As title
ghstack-source-id: 106474870

Test Plan: CI

Differential Revision: D22200640

fbshipit-source-id: 1f4c7bbf54be8c4187c9338fefdf14b501597d98
2020-06-24 00:42:25 -07:00
Jerry Zhang
0e26a03ef9 [quant][graphmode] Enable inplace option for top level API (#40414)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40414

After `_reconstruct` is supported in RecursiveScriptModule (https://github.com/pytorch/pytorch/pull/39979),
we can support the inplace option in the quantization API.

Test Plan: Imported from OSS

Differential Revision: D22178326

fbshipit-source-id: c78bc2bcf2c42b06280c12262bb31aebcadc6c32
2020-06-23 16:42:48 -07:00
Vasiliy Kuznetsov
b02c932fb6 qat eager: remove unneeded modules (#40396)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40396

Removes activation and normalization modules from eager mode QAT.
These were incorrectly added, but we don't actually need them.

Test Plan:
```
python test/test_quantization.py TestQuantizationAwareTraining
```

Imported from OSS

Differential Revision: D22169768

fbshipit-source-id: b5bd753dafe92e90e226fb773eb18c6aae179703
2020-06-22 17:45:51 -07:00