Commit Graph

310 Commits

Author SHA1 Message Date
Supriya Rao
489af4ddcb [quant] Add quant APIs to save/load observer state_dict (#44846)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44846

The save function traverses the model state dict to pick out the observer stats.
The load function traverses the module hierarchy to load the state dict into module attributes, depending on the observer type.
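
A hedged usage sketch; the helper names `get_observer_state_dict` / `load_observer_state_dict` and their import path are assumed here, since the message does not spell them out:
```python
import torch
import torch.nn as nn
from torch.quantization import get_default_qconfig, prepare
# Helper names assumed from this PR; not spelled out in the message above.
from torch.quantization.observer import get_observer_state_dict, load_observer_state_dict

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)
    def forward(self, x):
        return self.fc(x)

# Two prepared copies of the same model, observers inserted in both.
m_calibrated, m_fresh = M().eval(), M().eval()
for m in (m_calibrated, m_fresh):
    m.qconfig = get_default_qconfig("fbgemm")
    prepare(m, inplace=True)

# Calibrate one copy, then transfer only the observer statistics to the other.
m_calibrated(torch.randn(2, 4))
obs_state = get_observer_state_dict(m_calibrated)   # picks observer entries out of the state dict
load_observer_state_dict(m_fresh, obs_state)         # loads them back into module attributes
```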

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_save_observer_state_dict

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23746821

fbshipit-source-id: 05c571b62949a2833602d736a81924d77e7ade55
2020-09-29 01:52:42 -07:00
James Reed
b0bdc82a00 [FX][EZ] Fix bug where copying node made non-unique name (#45311)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45311

Test Plan: Imported from OSS

Reviewed By: dzhulgakov

Differential Revision: D23917864

Pulled By: jamesr66a

fbshipit-source-id: 10d0a4017ffe160bce4ba0d830e035616bbded74
2020-09-28 22:55:20 -07:00
Jerry Zhang
f93ead6d37 [quant][eagermode] Custom module support (#44835)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44835

This is for feature parity with fx graph mode quantization

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23745086

fbshipit-source-id: ae2fc86129f9896d5a9039b73006a4da15821307
2020-09-23 15:39:40 -07:00
Jerry Zhang
adb2b380ba [quant][graphmode][fx] qconfig_dict support more types of configurations (#44856)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44856

Support the following format of qconfig_dict:
```python
qconfig_dict = {
    # optional, global config
    "": qconfig?,

    # optional, used for module and function types
    # could also be split into module_types and function_types if we prefer
    "object_type": [
      (nn.Conv2d, qconfig?),
      (F.add, qconfig?),
      ...,
    ],

    # optional, used for module names
    "module_name": [
      ("foo.bar", qconfig?)
      ...,
    ],

    # optional, matched in order, first match takes precedence
    "module_name_regex": [
      ("foo.*bar.*conv[0-9]+", qconfig?)
      ...,
    ]
    # priority (in increasing order): global, object_type, module_name_regex, module_name
    # qconfig == None means fusion and quantization should be skipped for anything
    # matching the rule
}
```
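
For illustration, a hedged sketch of passing such a dict to the FX prepare/convert entry points (the model and qconfig choices are illustrative; signatures assumed from the FX top-level API commits later in this log):
```python
import torch
import torch.nn as nn
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.linear = nn.Linear(8, 4)
    def forward(self, x):
        return self.linear(self.conv(x).mean(dim=(2, 3)))

qconfig = get_default_qconfig("fbgemm")
qconfig_dict = {
    "": qconfig,                             # global default
    "object_type": [(nn.Linear, qconfig)],   # per-type override
    "module_name": [("conv", None)],         # None: skip quantization for this module
}

m = M().eval()
prepared = prepare_fx(m, qconfig_dict)       # observers inserted per the rules above
prepared(torch.randn(1, 3, 8, 8))            # calibration
quantized = convert_fx(prepared)
```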

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23751304

fbshipit-source-id: 5b98f4f823502b12ae2150c93019c7b229c49c50
2020-09-23 13:59:53 -07:00
Supriya Rao
7fba30c2be [quant][fx][bug] Fix error in convert step for QAT (#45050)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45050

Update tests to actually test for QAT

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23808022

fbshipit-source-id: d749ab2d215fe19238ff9d539307ffce9ef0ca9b
2020-09-22 22:48:31 -07:00
Jerry Zhang
f575df201f [quant][graphmode][jit][api] Expose preserved_attrs from finalize to convert_jit (#44490)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44490

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23631142

fbshipit-source-id: f0913f0cb4576067e2a7288326024942d12e0ae0
2020-09-22 19:37:25 -07:00
Jerry Zhang
ccfbfe5eb5 [quant][graphmode][fx] Custom module support (#44766)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44766

Some modules are not symbolically traceable, e.g. LSTM (since it has
input-dependent control flow). To support quantization in these cases, the user provides
the corresponding observed and quantized versions of the custom module: the observed
custom module has observers already inserted, and the quantized version has the
corresponding ops quantized. Then use
```
from torch.quantization import register_observed_custom_module_mapping
from torch.quantization import register_quantized_custom_module_mapping
register_observed_custom_module_mapping(CustomModule, ObservedCustomModule)
register_quantized_custom_module_mapping(CustomModule, QuantizedCustomModule)
```
to register the custom module mappings. We also need to define a custom delegate class
for symbolic tracing in order to prevent the custom module from being traced:
```python
class CustomDelegate(DefaultDelegate):
    def is_leaf_module(self, m):
        return (m.__module__.startswith('torch.nn') and
                not isinstance(m, torch.nn.Sequential)) or \
               isinstance(m, CustomModule)

m = symbolic_trace(original_m, delegate_class=CustomDelegate)
```

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23723455

fbshipit-source-id: 50d666e29b94cbcbea5fb6bcc73b00cff87eb77a
2020-09-22 17:11:46 -07:00
James Reed
7f4a27be3a [resubmit][FX] s/get_param/get_attr/ (#45147)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45147

ghstack-source-id: 112605923

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D23845096

fbshipit-source-id: 9ca209aa84cbaddd6e89c52b541e43b11197e2d5
2020-09-22 17:06:18 -07:00
Zafar
2b1f25885e [quant] Fix ConvTranspose mapping (#44844)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44844

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23746466

Pulled By: z-a-f

fbshipit-source-id: cb84e0fef5ab82e8ed8dd118d9fb21ee7b480ef7
2020-09-22 11:59:42 -07:00
James Reed
1fd48a9d1f Revert D23798016: [FX] s/get_param/get_attr/
Test Plan: revert-hammer

Differential Revision:
D23798016 (c941dd3492)

Original commit changeset: 1d2f3db1994a

fbshipit-source-id: 974d930064b37d396c5d66c905a63d45449813e5
2020-09-22 10:32:51 -07:00
James Reed
c941dd3492 [FX] s/get_param/get_attr/ (#45000)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45000

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D23798016

Pulled By: jamesr66a

fbshipit-source-id: 1d2f3db1994a62b95d0ced03bf958e54d30c35dd
2020-09-21 14:09:32 -07:00
Vasiliy Kuznetsov
2163d31016 histogram observer: ensure buffer shape consistency (#44956)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44956

Makes the buffer shapes for HistogramObserver the same in
uninitialized and initialized states.

This is useful because the detectron2 checkpointer assumes
that these shapes will stay the same, so it removes the
need for manual hacks around the shapes changing.
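
A minimal sketch of the invariant this change enforces (illustrative, not from the PR):
```python
import torch
from torch.quantization import HistogramObserver

fresh = HistogramObserver()
calibrated = HistogramObserver()
calibrated(torch.randn(100))  # populate histogram / min_val / max_val

# Uninitialized and initialized observers expose buffers of the same shape,
# so a checkpointer can load one state_dict into the other without shape hacks.
for name in ("min_val", "max_val"):
    assert fresh.state_dict()[name].shape == calibrated.state_dict()[name].shape

fresh.load_state_dict(calibrated.state_dict())
```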

Test Plan:
```
python test/test_quantization.py TestObserver.test_histogram_observer_consistent_buffer_shape
```

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23785382

fbshipit-source-id: 1a83fd4f39b244b00747c368d5d305a07d877c92
2020-09-19 09:29:39 -07:00
Supriya Rao
1fde54d531 [quant][qat] Ensure fake_quant and observer can be disabled on scriptmodule (#44773)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44773

The model is created and prepared using the fx APIs and then scripted for training.
In order to test QAT on the scripted model we need to be able to disable/enable the
fake_quant and observer modules on it.
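
A hedged sketch of the intended workflow, assuming the `prepare_fx` signature from the FX top-level API commits later in this log; the enable/disable helpers are existing torch.quantization utilities:
```python
import torch
import torch.nn as nn
from torch.quantization import (
    get_default_qat_qconfig,
    disable_fake_quant, enable_fake_quant,
    disable_observer, enable_observer,
)
from torch.quantization.quantize_fx import prepare_fx

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)
    def forward(self, x):
        return self.fc(x)

qconfig_dict = {"": get_default_qat_qconfig("fbgemm")}
prepared = prepare_fx(M().train(), qconfig_dict)  # fake_quant/observer modules inserted
scripted = torch.jit.script(prepared)

# The toggles also work on the scripted model.
scripted.apply(disable_observer)
scripted.apply(disable_fake_quant)
scripted.apply(enable_observer)
scripted.apply(enable_fake_quant)
```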

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qat_and_script

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23741354

fbshipit-source-id: 3fee7aa9b049d9901313b977710f4dc1c4501532
2020-09-17 10:21:52 -07:00
Supriya Rao
361b38da19 [quant][fx] Add node name as prefix to observer module name (#44765)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44765

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_save_observer_state_dict

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23741355

fbshipit-source-id: 7185ceae5b3b520ac0beebb627c44eab7ae7d231
2020-09-17 10:17:42 -07:00
Xiang Gao
20ac736200 Remove py2 compatible future imports (#44735)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735

Reviewed By: mruberry

Differential Revision: D23731306

Pulled By: ezyang

fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f
2020-09-16 12:55:57 -07:00
Supriya Rao
3f512b0de2 [quant][qat] Ensure observers and fq modules are scriptable (#44749)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44749

Ensure the fx module is scriptable after calling prepare_qat on it.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qat_and_script

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23718380

fbshipit-source-id: abf63ffb21e707f7def8f6c88246877f5aded58c
2020-09-16 09:30:07 -07:00
Jerry Zhang
e594c30bc2 [quant][graphmode][fx] Support fp16 dynamic quantization for linear (#44582)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44582

Test Plan:
test_quantize_fx.py

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23665974

fbshipit-source-id: 19ba6c61a9c77ef570b00614016506e9a2729f7c
2020-09-14 15:43:08 -07:00
Zafar
742654d1b6 [quant] ConvTranspose1d / ConvTranspose2d (#40371)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40371

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D22158981

Pulled By: z-a-f

fbshipit-source-id: defbf6fbe730a58d5b155dcb2460dd969797215c
2020-09-14 14:25:06 -07:00
Jerry Zhang
b6f0ea0c71 [quant][graphmode][fx][fix] Remove qconfig in convert (#44526)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44526

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23641960

fbshipit-source-id: 546da1c16694d1e1dfb72629085acaae2165e759
2020-09-11 15:51:47 -07:00
Jerry Zhang
a82ea6a91f [quant][graphmode][fx][fix] Support None qconfig in convert (#44524)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44524

A None qconfig was not handled previously.
closes: https://github.com/pytorch/pytorch/issues/44438

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23640269

fbshipit-source-id: 8bfa88c8c78d4530338d9d7fa9669876c386d91f
2020-09-11 15:22:25 -07:00
Vasiliy Kuznetsov
70dfeb44bd MinMax based observers: respect device affinity for state_dict (#44537)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44537

Originally, the `min_val`, `max_val`, `min_vals`, `max_vals`
attributes of observers were Tensors but not buffers.  They had custom
state_dict save/load code to ensure their state was saved.

At some point, these attributes became buffers, and the custom
save/load code remained. This introduced a subtle bug:
* create model A, move it to a device (cpu/cuda) and save its state_dict
* create model B, load its state dict.
* `min_val|min_vals|max_val|max_vals` would always be loaded to model A's device, even if the rest of model B was on a different device
* the above is inconsistent with how save/load on different devices is expected to work (see https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-across-devices)

In practice, the case people would sometimes hit is:
* model A is on CPU, state dict is saved
* model B is created and moved to GPU, state_dict from model A is loaded
* assertions throw when operations are attempted across different devices

This PR fixes the behavior by removing the custom save/load where
possible and letting the default `nn.Module` save/load code handle
device assignment.  We special case `PerChannelMinMaxObserver` and its
children to allow for loading buffers of different size, which is
normal.

There are some followups to also enable this for HistogramObserver
and FakeQuantize, which can be done in separate PRs due to higher
complexity.
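
A minimal sketch of the expected behavior after this change, using MinMaxObserver directly:
```python
import torch
from torch.quantization import MinMaxObserver

obs_cpu = MinMaxObserver()
obs_cpu(torch.randn(10))            # populate min_val / max_val on CPU
state = obs_cpu.state_dict()

if torch.cuda.is_available():
    obs_gpu = MinMaxObserver().cuda()
    obs_gpu.load_state_dict(state)  # buffers follow the destination module's device
    print(obs_gpu.min_val.device)   # cuda:0 rather than obs_cpu's device
```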

Test Plan:
```
python test/test_quantization.py TestObserver.test_state_dict_respects_device_affinity
```

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23644493

fbshipit-source-id: 0dbb6aa309ad569a91a663b9ee7e44644080032e
2020-09-11 14:48:56 -07:00
Jerry Zhang
11fb51d093 [quant][graphmode][fx][fix] Support dictionary output (#44508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44508

Bug fix for dictionary output

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23636182

fbshipit-source-id: 0c00cd6b9747fa3f8702d7f7a0d5edb31265f466
2020-09-11 11:29:20 -07:00
Jerry Zhang
0c58a017bd [quant][eagermode][refactor] Add set/get method for quantization and fusion mappings (#43990)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43990

Allow users to register custom quantization and fusion patterns.

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D23485344

fbshipit-source-id: 4f0174ee6d8000d83de0f73cb370e9a1941d54aa
2020-09-10 21:29:39 -07:00
Supriya Rao
646ffd4886 [quant] Move EmbeddingBag eager quantization to static (#44217)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44217

Move the tests to static ones as well

Test Plan:
python test/test_quantization.py TestStaticQuantizedModule.test_embedding_bag_api

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23547386

fbshipit-source-id: 41f81c31e1613098ecf6a7eff601c7dcd4b09c76
2020-09-08 19:05:02 -07:00
Supriya Rao
57b87aaf59 [quant] Add quantized Embedding module (#44208)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44208

Add a quantized Embedding module in the static quantization namespace. Embedding
quantization requires only the weights to be quantized, so it is static.
Internally it calls the embedding_bag_byte op with the offsets set to correspond to the
indices.

Future PR will move EmbeddingBag quantization from dynamic to static as well.
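
A hedged usage sketch, assuming the module is exposed as `torch.nn.quantized.Embedding` and can be constructed directly with default quantized weights:
```python
import torch
import torch.nn.quantized as nnq

# Weight-only (static) quantized embedding lookup.
emb = nnq.Embedding(num_embeddings=10, embedding_dim=12)
indices = torch.tensor([0, 3, 7, 9])
out = emb(indices)      # internally dispatches to the embedding_bag_byte op
print(out.shape)        # torch.Size([4, 12])
```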

Test Plan:
python test/test_quantization.py test_embedding_api

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23547384

fbshipit-source-id: eddc6fb144b4a771060e7bab5853656ccb4443f0
2020-09-08 19:04:59 -07:00
Jerry Zhang
6269b6e0f0 [quant][graphmode][fx][api] Call fuse in prepare (#43984)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43984

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23459261

fbshipit-source-id: 6b56b0916d76df67b9cc2f4be1fcee905d604019
2020-09-08 18:09:26 -07:00
Jerry Zhang
9f54bcc522 [quant][graphmode][fx] Support inplace option (#43983)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43983

Support the inplace option in the APIs.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23459260

fbshipit-source-id: 80409c7984f17d1a4e13fb1eece8e18a69ee43b3
2020-09-08 17:39:13 -07:00
Vasiliy Kuznetsov
00b5bd536f fx quant: add docblocks to _find_matches and _find_quants (#43928)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43928

Improving readability, no logic change.

Test Plan:
CI

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23440249

fbshipit-source-id: a7ebfc7ad15c73e26b9a94758e7254413cc17d29
2020-09-08 16:13:11 -07:00
Jerry Zhang
43e38d60d6 [quant][graphmode][fx] Support quantize per channel in all cases (#44042)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44042

Missed one case last time

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23479345

fbshipit-source-id: 30e6713120c494e9fab5584de4df9b25bec83d32
2020-09-08 15:45:14 -07:00
Vasiliy Kuznetsov
fd8e2064e0 quant: switch observers to use min_max (#42957)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42957

Switches observers to use the new min_max function to calculate
min and max at the same time.  We see around 45-50% speedup on
representative input shapes on the microbenchmarks for all observers except `HistogramObserver`.
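
The underlying idea, sketched with the public fused reduction available in torch (the exact helper the observers call may differ):
```python
import torch

x = torch.randn(1024, 1024)

# Two separate reductions: two passes over the data.
min_val, max_val = x.min(), x.max()

# One fused reduction: a single pass, which is where the observer speedup comes from.
min_val2, max_val2 = torch.aminmax(x)

assert torch.equal(min_val, min_val2) and torch.equal(max_val, max_val2)
```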

Test Plan:
CI for correctness

performance:
```
cd benchmarks/operator_benchmark
// repeat (before diff, after diff) x (cpu, cuda)
python -m pt.qobserver_test --tag_filter all --device cpu
/*
    * before, cpu: https://our.intern.facebook.com/intern/paste/P138633280/
    * before, cuda: https://our.intern.facebook.com/intern/paste/P138639473/
    * after, cpu: https://our.intern.facebook.com/intern/paste/P138635458/
    * after, cuda: https://our.intern.facebook.com/intern/paste/P138636344/
*/
```

Imported from OSS

Reviewed By: supriyar

Differential Revision: D23093995

fbshipit-source-id: 9f416d144109b5b80baf089eb4bcfabe8fe358d5
2020-09-08 11:39:44 -07:00
Vasiliy Kuznetsov
618b4dd763 fx quant prepare: clarify naming (#44125)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44125

In `Quantizer._prepare`, `observed` was used for two different variables
with different types.  Making the names a bit cleaner and removing the
name conflict.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: dskhudia

Differential Revision: D23504109

fbshipit-source-id: 0f73eac3d6dd5f72ad5574a4d47d33808a70174a
2020-09-04 21:29:56 -07:00
Zachary DeVito
2ad5a82c43 [fx] get rid of graph_module.root (#44092)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44092

Instead, submodules and weights are installed directly on the
graph_module by transferring the original modules. This makes it more
likely that scripting will succeed (since we no longer have submodules
that are not used in the trace). It also prevents layered transforms
from having to special case handling of the `root` module. GraphModules
can now be re-traced as part of the input to other transforms.
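
A small sketch of what this enables (an assumed minimal example, not from the PR):
```python
import torch
import torch.nn as nn
from torch.fx import symbolic_trace

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
    def forward(self, x):
        return self.linear(x).relu()

gm = symbolic_trace(M())
print(gm.linear)            # submodule lives directly on the GraphModule, no `root`
gm2 = symbolic_trace(gm)    # a GraphModule can itself be re-traced by another transform
```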

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D23504210

Pulled By: zdevito

fbshipit-source-id: f79e5c4cbfc52eb0ffb5d6ed89b37ce35a7dc467
2020-09-04 11:35:32 -07:00
Vinod Kumar S
2a1fc56694 replace the white list from default mappings (#41802)
Summary:
Replaced "whitelist" from default_mappings.py
Fixes https://github.com/pytorch/pytorch/issues/41756

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41802

Reviewed By: ngimel

Differential Revision: D23521452

Pulled By: malfet

fbshipit-source-id: 019a2d5c06dc59dc53d6c48b70fb35b216299cf4
2020-09-04 10:04:28 -07:00
Vasiliy Kuznetsov
71510c60ad fx qat: respect device affinity (#44115)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44115

Fixes device affinity in the FX prepare pass for QAT. Before this PR, observers
were always created on CPU. After this PR, observers are created on the
same device as the rest of the model. This will enable QAT prepare to
work regardless of whether users move the model to cuda before or after
calling this pass.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_qat_prepare_device_affinity
```

Imported from OSS

Reviewed By: supriyar

Differential Revision: D23502291

fbshipit-source-id: ec4ed20c21748a56a25e3395b35ab8640d71b5a8
2020-09-03 16:16:59 -07:00
Meghan Lele
7816d53798 [JIT] Add mypy type annotations for JIT (#43862)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43862

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D23491151

Pulled By: SplitInfinity

fbshipit-source-id: 88367b89896cf409bb9ac3db7490d6779efdc3a4
2020-09-03 15:09:24 -07:00
Vasiliy Kuznetsov
f9efcb646b fx quant: clarify state in Quantizer object (#43927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43927

Adds uninitialized placeholders for various state
used throughout the Quantizer object, with documentation
on what they are. No logic change.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23439473

fbshipit-source-id: d4ae83331cf20d81a7f974f88664ccddca063ffc
2020-09-02 16:34:00 -07:00
Vasiliy Kuznetsov
df8da5cb5a fx quant: make load_arg function more clear (#43923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43923

Readability improvements to `Quantizer.convert.load_arg`, makes
things easier to read.
1. add docblock
2. `arg` -> `arg_or_args`, to match what's actually happening
3. `loaded_arg` -> `loaded_args`, to match what's actually happening

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23438745

fbshipit-source-id: f886b324d2e2e33458b72381499e37dccfc3bd30
2020-09-02 09:06:05 -07:00
Vasiliy Kuznetsov
77ef77e5fa fx quant: rename matches -> is_match (#43914)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43914

Renames `matches` function to `is_match`, since there is also
a list named `matches` we are passing around in `Quantizer`,
and it would be good to reduce name conflicts.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23435601

fbshipit-source-id: 394af11e0120cfb07dedc79d5219247330d4dfd6
2020-09-02 09:06:01 -07:00
Vasiliy Kuznetsov
6f5282adc8 add quantization debug util to pretty print FX graphs (#43910)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43910

Adds a debug function to get a representation of all nodes in the
graph, such as

```
name          op      target         args               kwargs
x             plchdr  x              ()                 {}
linear_weight gt_prm  linear.weight  ()                 {}
add_1         cl_fun  <bi_fun add>   (x, linear_weight) {}
linear_1      cl_mod  linear         (add_1,)           {}
relu_1        cl_meth relu           (linear_1,)        {}
sum_1         cl_fun  <bi_meth sum>  (relu_1,)          {'dim': -1}
topk_1        cl_fun  <bi_meth topk> (sum_1, 3)         {}
```

using only the Python standard library. This is useful for printing the internal state of
graphs when working on FX code.

Has some on-by-default logic to shorten things so that node reprs for
toy models and unit tests fit into 80 chars.

I'm flexible on the function name and location; I care more that this is
accessible both from inside PyTorch and from debug scripts that
are not checked in.
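
Not the checked-in utility itself, but a minimal re-creation of the idea using only the standard library plus torch.fx:
```python
import torch
import torch.nn as nn
from torch.fx import symbolic_trace

def print_nodes(gm):
    # One row per FX node: name, op, target, args, kwargs.
    rows = [(n.name, n.op, str(n.target), str(n.args), str(n.kwargs))
            for n in gm.graph.nodes]
    widths = [max(len(row[i]) for row in rows) for i in range(5)]
    for row in rows:
        print("  ".join(col.ljust(w) for col, w in zip(row, widths)))

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
    def forward(self, x):
        return torch.relu(self.linear(x)).sum(dim=-1)

print_nodes(symbolic_trace(M()))
```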

Test Plan:
see
https://gist.github.com/vkuzo/ed0a50e5d6dc7442668b03bb417bd603 for
example usage

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23435029

fbshipit-source-id: 1a2df797156a19cedd705e9e700ba7098b5a1376
2020-09-02 09:04:44 -07:00
Jerry Zhang
8fd9fe93be [quant][graphmode][fx] Support dynamic quantization without calibration (#43952)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43952

Run the weight observer for dynamic quantization before inserting the quant/dequant nodes.

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D23452123

fbshipit-source-id: c322808fa8025bbadba36c2e5ab89f59e85de468
2020-09-01 19:09:48 -07:00
Jerry Zhang
0ffe3d84d5 [quant][graphmode][fx] Support dynamic quantization without calibration (#43892)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43892

Run the weight observer in the convert function, so the user does not need to run calibration.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23429758

fbshipit-source-id: 5bc222e3b731789ff7a86463c449690a58dffb7b
2020-09-01 17:01:48 -07:00
Jerry Zhang
d15b9d980c [quant][graphmode][fx][refactor] Move patterns to separate files (#43891)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43891

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23429759

fbshipit-source-id: f19add96beb7c8bac323ad78f74588ca1393040c
2020-09-01 16:37:33 -07:00
Jerry Zhang
825c109eb7 [reland][quant][graphmode][fx] Add support for weight prepack folding (#43728) (#43902)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43902

Trace back from the weight node until we hit getattr, reconstruct a graph module from the traced nodes,
and run it to pack the weight. Then replace the original chain of ops with the packed weight.
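
A hedged sketch of the traversal idea on a plain FX graph; the helper below is illustrative, not the PR's implementation (the attribute-access op is spelled get_attr in current FX, get_param in the FX of this era, per the rename commit above):
```python
import torch
import torch.nn.functional as F
from torch.fx import symbolic_trace

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(4, 4))
    def forward(self, x):
        # The weight passes through a transpose before the functional call.
        return F.linear(x, self.weight.t())

gm = symbolic_trace(M())

def back_to_get_attr(node):
    """Collect the producer chain of `node`, stopping at get_attr nodes."""
    chain, frontier = [], [node]
    while frontier:
        n = frontier.pop()
        chain.append(n)
        if n.op != "get_attr":
            frontier.extend(n.all_input_nodes)
    return list(reversed(chain))

linear_node = next(n for n in gm.graph.nodes if n.target is F.linear)
weight_chain = back_to_get_attr(linear_node.args[1])
print([(n.op, str(n.target)) for n in weight_chain])
# e.g. [('get_attr', 'weight'), ('call_method', 't')]
```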

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23432431

fbshipit-source-id: 657f21a8287494f7f87687a9d618ca46376d3aa3
2020-09-01 00:26:19 -07:00
Jerry Zhang
7db7da7151 [reland][quant][graphmode][fx] Add top level APIs (#43581) (#43901)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43901

Add APIs similar to those of eager mode and graph mode on TorchScript (see the sketch after this list):
- fuse_fx
- quantize_fx (for both post training static and qat)
- quantize_dynamic_fx (for post training dynamic)
- prepare_fx (for both post training static and qat)
- prepare_dynamic_fx (for post training dynamic)
- convert_fx (for all modes)
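
For example, a hedged sketch of the fuse_fx entry point (assuming it takes just the model, as the eager fuse API does):
```python
import torch
import torch.nn as nn
from torch.quantization.quantize_fx import fuse_fx

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, 3)
        self.bn = nn.BatchNorm2d(3)
        self.relu = nn.ReLU()
    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

m = M().eval()
fused = fuse_fx(m)   # conv + bn + relu folded into a single fused module
print(fused)
```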

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23432430

fbshipit-source-id: fc99eb75cbecd6ee7a3aa6c8ec71cd499ff7e3c1
2020-08-31 18:24:26 -07:00
Qi Zhou
f73ba88946 Avoid resizing in MinMaxObserver (#43789)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43789

Since it's a single element, and in some cases we may not be able to resize the
buffers.

Test Plan: unit tests

Reviewed By: supriyar

Differential Revision: D23393108

fbshipit-source-id: 46cd7f73ed42a05093662213978a01ee726433eb
2020-08-31 17:41:39 -07:00
Alban Desmaison
f7bae5b6b1 Revert D23385091: [quant][graphmode][fx] Add top level APIs
Test Plan: revert-hammer

Differential Revision:
D23385091 (eb4199b0a7)

Original commit changeset: b789e54e1a0f

fbshipit-source-id: dc3dd9169d34beab92488d78d42d7e7d05e771d1
2020-08-31 12:18:29 -07:00
Alban Desmaison
68304c527a Revert D23385090: [quant][graphmode][fx] Add support for weight prepack folding
Test Plan: revert-hammer

Differential Revision:
D23385090 (ef08f92076)

Original commit changeset: 11341f0af525

fbshipit-source-id: fe2bcdc16106923a2cee99eb5cc0a1e9c14ad2c5
2020-08-31 12:17:28 -07:00
Jerry Zhang
ef08f92076 [quant][graphmode][fx] Add support for weight prepack folding (#43728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43728

Trace back from the weight node until we hit getattr, reconstruct a graph module from the traced nodes,
and run it to pack the weight. Then replace the original chain of ops with the packed weight.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23385090

fbshipit-source-id: 11341f0af525a02ecec36f163a9cd35dee3744a1
2020-08-31 10:35:11 -07:00
Jerry Zhang
eb4199b0a7 [quant][graphmode][fx] Add top level APIs (#43581)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43581

Add APIs similar to those of eager mode and graph mode on TorchScript:
- fuse_fx
- quantize_fx (for both post training static and qat)
- quantize_dynamic_fx (for post training dynamic)
- prepare_fx (for both post training static and qat)
- prepare_dynamic_fx (for post training dynamic)
- convert_fx (for all modes)

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23385091

fbshipit-source-id: b789e54e1a0f3af6b026fd568281984e253e0433
2020-08-31 10:12:55 -07:00
Jerry Zhang
b8d34547ee [quant][graphmode][fx][fix] enable per channel quantization for functional ops (#43534)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43534

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D23310857

fbshipit-source-id: ff7a681ee55bcc51f564e9de78319249b989366c
2020-08-31 09:35:25 -07:00