Commit Graph

146 Commits

Author SHA1 Message Date
Sam Estep
4753100a3b Un-ignore F403 in .flake8 (#55838)
Summary:
Generally wildcard imports are bad for the reasons described here: https://www.flake8rules.com/rules/F403.html

This PR replaces wildcard imports with an explicit list of imported items where possible, and adds a `# noqa: F403` comment in the other cases (mostly re-exports in `__init__.py` files).

This is a prerequisite for https://github.com/pytorch/pytorch/issues/55816, because currently [`tools/codegen/dest/register_dispatch_key.py` simply fails if you sort its imports](https://github.com/pytorch/pytorch/actions/runs/742505908).
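
For illustration (not part of the PR itself), here is the kind of change this makes, using a standard-library module:

```
# Before: flake8 flags F403 because the names brought in are not statically known.
# from os.path import *

# After, option 1: list the imported names explicitly.
from os.path import join, exists

# After, option 2: for intentional re-exports (e.g. in an __init__.py),
# keep the wildcard but silence the rule on that line.
from os.path import *  # noqa: F403

print(join("a", "b"), exists("a"))
```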

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55838

Test Plan: CI. You can also run `flake8` locally.

Reviewed By: jbschlosser

Differential Revision: D27724232

Pulled By: samestep

fbshipit-source-id: 269fb09cb4168f8a51fd65bfaacc6cda7fb87c34
2021-04-13 09:24:07 -07:00
Vasiliy Kuznetsov
ec9b20ddc0 fx quant: fix edge case with copynode after user function (#55710)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55710

In the current code, there is an edge case which leads to an error
after the prepare step:

1. have a pattern like this:

```
user_func_unmatched_to_qhandler -> node_matched_to_copy_node_qhandler
```

2. the user function returns a type which is not observable (i.e. not a
Tensor)

3. if this is run through `prepare_fx`, calibrating it with data leads
to a runtime error, because observers cannot observe non-tensor types.

This PR fixes the issue.  If a node matched to `CopyNodeQuantizeHandler`
is after an unmatched node, we delete the observer.
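
A minimal sketch of the problematic shape, using a hypothetical user function (not the actual test case):

```
import torch
import torch.fx

@torch.fx.wrap  # keep user_func as a single, unmatched call_function node when tracing
def user_func(x):
    # hypothetical user function: not matched to any quantize handler, and it
    # returns a non-Tensor (a tuple), which an observer cannot observe
    return torch.chunk(x, 2, dim=1)

class M(torch.nn.Module):
    def forward(self, x):
        parts = user_func(x)   # unmatched node with a non-observable output
        return parts[0]        # getitem-style node handled as a "copy" of its input

# Before this fix, prepare_fx placed an observer between the two nodes and
# calibration raised a runtime error; with the fix, that observer is removed.
```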

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_no_obs_between_unmatched_node_and_copy_node
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D27686811

fbshipit-source-id: 320be41b1f383c6352ff89fb39a9f480822a3bb2
2021-04-12 08:47:44 -07:00
Jerry Zhang
3e8ebb17aa [reland][quant][graphmode][fx][refactor] Factor out insert_observers_for_model to a separate function (#54733) (#55307)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55307

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27567475

fbshipit-source-id: 74b7db63f7e1e795e7ac7ed6027cf786d922e7bf
2021-04-09 17:56:55 -07:00
Jerry Zhang
4d449f915f [quant][graphmode][fx] Separate handling Copy operator to a helper function (#54644) (#55429)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55429

Previously we special-cased the copy operator in the normal observer-insertion code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.

Test Plan:
Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27609972

fbshipit-source-id: 378f6aa70f18c0b477b62b6efe236648748aae7e
2021-04-08 22:12:24 -07:00
Bradley Davis
8eaa4a97b7 Back out "[quant][graphmode][fx] Separate handling Copy operator to a helper function" (#55388)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55388

Temporarily revert D27314678 (c57541ce06); it appears to cause a perf regression that makes quantization of some models take too long to complete tests.

Reviewed By: houseroad

Differential Revision: D27583809

fbshipit-source-id: e9c088ccbfd3bfb3a1d4c7eafee3eca29ee7717b
2021-04-06 14:20:36 -07:00
Mike Ruberry
15f04e3466 Revert D27408378: [quant][graphmode][fx][refactor] Factor out insert_observers_for_model to a separate function
Test Plan: revert-hammer

Differential Revision:
D27408378 (c445f4ee93)

Original commit changeset: 9143f0a6f939

fbshipit-source-id: ae65ea798a6d72f2ec724c4c1b492937edddf721
2021-03-31 20:51:42 -07:00
Jerry Zhang
c445f4ee93 [quant][graphmode][fx][refactor] Factor out insert_observers_for_model to a separate function (#54733)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54733

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27408378

fbshipit-source-id: 9143f0a6f939fa80f1d1d6bae4b2d37aa21cb9b9
2021-03-31 18:50:47 -07:00
Jerry Zhang
c57541ce06 [quant][graphmode][fx] Separate handling Copy operator to a helper function (#54644)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54644

Previously we special-cased the copy operator in the normal observer-insertion code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27314678

fbshipit-source-id: d36870ceb3717bc01eaeaa6f3f1532ad562cbaf1
2021-03-31 17:50:32 -07:00
Jerry Zhang
c0d6dbdce4 [quant][fx][graphmode][refactor] Change activation_post_process_map to track the observer name instead (#54643)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54643

A refactor needed for future changes.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27314677

fbshipit-source-id: 972fbfb506f86da13f8817b3eaa5e6d0ad16ffe1
2021-03-31 17:50:30 -07:00
Jerry Zhang
c2adedf6fe [quant][graphmode][refactor] Remove redundant code (#54073)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54073

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27086067

fbshipit-source-id: b1995138de56f1352c5df03378ebc2832bf35ef7
2021-03-31 17:50:27 -07:00
Jerry Zhang
55544cb13a [quant][graphmode][fx] Add support for one value being quantized with different qconfigs (#53586)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53586

Previously one value could only be quantized to one dtype. This PR adds support for quantizing one value
in the fx graph with multiple dtypes, e.g. first quantizing to int8 and then to float16.

We might do some follow-up PRs to clean up the hacks and refactor the code.
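
A hypothetical configuration sketching the idea; the submodule names and the qconfig construction below are assumptions, not taken from this PR:

```
import torch
from torch.quantization import QConfig
from torch.quantization.observer import MinMaxObserver, PlaceholderObserver

# int8 qconfig for one consumer of a shared value...
int8_qconfig = QConfig(
    activation=MinMaxObserver.with_args(dtype=torch.quint8),
    weight=MinMaxObserver.with_args(dtype=torch.qint8),
)
# ...and an fp16 qconfig for another consumer of the same value.
fp16_qconfig = QConfig(
    activation=PlaceholderObserver.with_args(dtype=torch.float16),
    weight=PlaceholderObserver.with_args(dtype=torch.float16),
)
# With this change, a single intermediate value shared by "linear1" and
# "linear2" can be quantized to int8 for one use and to float16 for the other.
qconfig_dict = {
    "module_name": [
        ("linear1", int8_qconfig),
        ("linear2", fp16_qconfig),
    ],
}
```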

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_multiple_qconfigs_single_value

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26912676

fbshipit-source-id: ae3653fd67f05870a3a9e808f491871826c555d5
2021-03-31 17:48:50 -07:00
Supriya Rao
6f63126b5c [quant][fx] Add pass in convert to fold quant-dequant sequence (#54860)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54860

Currently we insert a quantize_per_tensor op when we encounter a quantizable input, so if that input
has multiple uses and not all of them are quantizable, we need to add a dequantize op before the
non-quantizable ops.

In this pass, for a sequence of quantize_per_tensor -> dequantize, we fold the pair, since together
they are a no-op.

[internal only][pyper]

Before this change we had redundant dequantize nodes in the graph
Example 1x inline_cvr graph https://www.internalfb.com/intern/everpaste/?handle=GODBxAlUMzGHD6MSACpHKKu9qjorbsIXAAAz
 FC layers -> 37
 quantize_per_tensor -> 30
 dequantize -> 49

After this change
https://www.internalfb.com/intern/everpaste/?handle=GAl0uQnOlDNmpLoSAB-GZqRxu9wMbsIXAAAz
 FC layers -> 37
 quantize_per_tensor -> 30
 dequantize -> 39

This removes 10 extra dequantize nodes from the graph.
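
A minimal sketch of what such a folding pass can look like on an FX graph (illustrative only, not the actual implementation):

```
import torch
from torch.fx import GraphModule

def fold_quant_dequant(gm: GraphModule) -> GraphModule:
    # remove quantize_per_tensor -> dequantize pairs; quantizing and then
    # immediately dequantizing the same value is a no-op
    for node in list(gm.graph.nodes):
        if node.op == "call_method" and node.target == "dequantize":
            prev = node.args[0]
            if (prev.op == "call_function"
                    and prev.target == torch.quantize_per_tensor
                    and len(prev.users) == 1):
                # reroute users of the dequantize to the original fp32 value
                node.replace_all_uses_with(prev.args[0])
                gm.graph.erase_node(node)
                gm.graph.erase_node(prev)
    gm.recompile()
    return gm
```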

Test Plan:
python test/test_quantization.py test_fold_quant_dequant

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27390506

fbshipit-source-id: 56e6fb8496171246eccf4bd45eb8bebd87fcb740
2021-03-30 08:40:24 -07:00
Vasiliy Kuznetsov
b81e10a291 fx quant: fix bug with fusion patterns and disabling quantization (#54654)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54654

Fixes a bug where disabling quantization on potential fusion patterns
would lead to errors in the `convert` function.  For example:
1. have a model with add-relu
2. disable quantization for the part of the model containing add-relu
3. run prepare and convert; the convert step would fail because
intermediate nodes were missing from `env`.

The fix is to add handling for this edge case.  If quantization is
disabled, we manually copy the nodes for multi-node fusion patterns.
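
A minimal repro sketch, assuming the FX graph mode quantization API of this era (illustrative, not the actual test):

```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class Sub(torch.nn.Module):
    def forward(self, x, y):
        return torch.nn.functional.relu(x + y)  # add-relu fusion pattern

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.sub = Sub()
    def forward(self, x, y):
        return self.sub(x, y)

# setting the qconfig to None for "sub" disables quantization for the add-relu
qconfig_dict = {
    "": get_default_qconfig("fbgemm"),
    "module_name": [("sub", None)],
}
m = prepare_fx(M().eval(), qconfig_dict)
m(torch.randn(1, 4), torch.randn(1, 4))  # calibrate
m = convert_fx(m)  # previously failed; the unquantized pattern's nodes are now copied
```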

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_fusion_pattern_unquantized
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D27318454

fbshipit-source-id: 27c1fd1cb7c9711a8e8d338200971c428dae8f98
2021-03-25 22:21:41 -07:00
Yukio Siraichi
27048c1dfa Remove legacy constructor calls from _torch_ folder. (#53889)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53146
Related to https://github.com/pytorch/pytorch/issues/47112

As mentioned in https://github.com/pytorch/pytorch/issues/47112, the plan is to:

1. Verify that all `torch.Tensor()` scenarios are covered by other functions
2. Scrub internal `torch.Tensor()` uses
3. Update the docs and throw `TORCH_WARN_ONCE` if someone uses `torch.Tensor()`

In this PR, I replaced all occurrences of `torch.Tensor` present in the _torch_ folder.
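
For illustration, the kind of replacement involved:

```
import torch

# Legacy constructor (discouraged): always produces float32 and is ambiguous,
# e.g. torch.Tensor(3) makes an uninitialized tensor of size 3.
# x = torch.Tensor([1, 2, 3])

# Preferred replacements:
x = torch.tensor([1.0, 2.0, 3.0])  # infers dtype from the data
y = torch.empty(3)                 # explicit uninitialized tensor of a given size
z = torch.zeros(2, 2)              # explicit initialization
```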

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53889

Reviewed By: walterddr, zou3519

Differential Revision: D27190743

Pulled By: jbschlosser

fbshipit-source-id: 7ecc201d57935b8dbb98ae3718b60d95cb55a010
2021-03-19 15:20:19 -07:00
Vasiliy Kuznetsov
4884a6ab51 fx quant: clean up names of quantize handlers (#53614)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53614

Ensures that every subclass of `QuantizeHandler` has a clear name. This
prevents ambiguous names like `Cat`, which looks like a module but is
really a quantize handler.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26914784

fbshipit-source-id: 6dca7e27975c09f422f8e36f1d2b709bf3eaaadf
2021-03-12 07:43:53 -08:00
Vasiliy Kuznetsov
279b5372ab [not for land] fix fx quant for quant_layer -> stack -> sum (#53196)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53196

Before this PR, code patterns like this did not work:

```
x = some_quant_layer(x)
x = torch.stack([x, ...])
x = torch.sum(x, ...)
```

This did not work because `torch.sum` is treated as "quantized" due to
the newly added fp16 support, even though it is not actually "quantized"
for models where fp16 is not used. We may need to adjust the concept of
"quantized vs non-quantized" into a "dtype" for the longer-term fix.

The current PR is a hacky fix to unblock. We need to clean things up
before this is landable.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_quant_sum
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26783960

fbshipit-source-id: 3be7c3c1eaa2b8fcb99a105e1b0004c9ffd3a1c1
2021-03-12 07:43:50 -08:00
Vasiliy Kuznetsov
93d5807c1e [not for land yet] fix using size of quant layer in torch._assert (#53187)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53187

Before this diff, if we had code like

```
x = any_quant_layer(...)
x_size0 = x.size(0)
torch._assert(x_size0 == 1, "expected size 1 in dim 0")
```

The convert code would try to insert a dequantize after `x_size0`,
because it was a descendant of a quantized node and it was needed
for a non-quantized operation.  Since the actual type of the `size`
function output is an integer, this does not make sense.

For now, this is fixed as a one-off to unblock a customer.  In the
future, we may need to think more deeply about all the functions which
can return non-quantized types from quantized tensors and make sure
they are all covered.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_assert_on_size_after_quant_layer
```

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D26780690

fbshipit-source-id: 44cc25c9179d460efb3f110d40b73d854d676af5
2021-03-12 07:43:48 -08:00
Vasiliy Kuznetsov
ccab6680d5 [not for land yet] hacky fix for x.ndim followed by sub (#53120)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53120

Currently there is a pattern which is not handled correctly by
FX graph mode quantization:

```
def forward(self, x):
    ndim = x.ndim
    # or add, mul, div, etc
    x = torch.sub(x, ndim)
    return x
```

The reason this does not work is as follows:
1. x.ndim becomes a getattr node
2. the real world type of x.ndim is an integer, but this is not known from the graph (yet)
3. binary ops such as `torch.sub` require quantization of inputs
4. the framework inserts an observer to observe the output of `ndim`
5. the observer fails because `ndim` is not a Tensor

For now, we hack in a bandaid to unblock some teams; none of this is for
land. We will have to think of a better fix which is landable (TBD).

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_getattr_with_nontensor_result
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26756180

fbshipit-source-id: c0e498766b22c23df74fbb5aaeaa237c4c944263
2021-03-12 07:42:12 -08:00
Jerry Zhang
7484c56fa3 [quant][graphmode][fx] Fix a condition check for CopyNode (#53585)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53585

Previously fp16_static CopyNode would be marked as unquantized because of
an incorrect condition check of whether a Node is statically quantized or not.
This PR fixes that.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26912677

fbshipit-source-id: 4ddb538714c5ba2db28430de5e1cf2931baf1993
2021-03-11 09:32:20 -08:00
Jerry Zhang
0584fd9339 [quant][fx][graphmode][fix] Only insert observers for fixed qparam ops (#53330)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53330

Fixed a condition check for fixed qparam ops; previously we were incorrectly including CopyNodes as well.

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_fixed_qparams_ops_fp16

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26836867

fbshipit-source-id: 8c486155244f852e675a938c3f4237f26505671c
2021-03-10 16:51:36 -08:00
hyperfraise
f9185973d1 [quantization] Add some support for 3d operations (#50003)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50002

The last commit adds tests for 3d conv with the `SubModelFusion` and `SubModelWithoutFusion` classes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50003

Reviewed By: mrshenli

Differential Revision: D26325953

Pulled By: jerryzh168

fbshipit-source-id: 7406dd2721c0c4df477044d1b54a6c5e128a9034
2021-03-10 16:40:35 -08:00
Supriya Rao
7cec4b3d4a [quant][fx] add _remove_qconfig flag to convert_fx (#53166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53166

Context: For fx modules that consist of scriptmodules, calling
delattr(module, 'qconfig') throws an attribute error. We will follow up
with a separate issue/repro to fix this problem.

This PR adds a temporary flag to the convert_fx API to preserve the qconfig attributes on the converted model.
We will remove this flag once we reach a conclusion on calling delattr on scriptmodules.
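
Hypothetical usage of the new flag, assuming the FX graph mode quantization API of this era:

```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

model = torch.nn.Sequential(torch.nn.Linear(4, 4)).eval()
prepared = prepare_fx(model, {"": get_default_qconfig("fbgemm")})
prepared(torch.randn(1, 4))  # calibrate

# keep the qconfig attributes on the converted model instead of deleting them
quantized = convert_fx(prepared, _remove_qconfig=False)
```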

Test Plan:
python test/test_quantization.py test_preserve_qconfig

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26771518

fbshipit-source-id: 9fd72816576856ffb4aa11f8fde08303d1df10a2
2021-03-03 12:58:05 -08:00
Jerry Zhang
d40b501cfc [quant][graphmode][fx][fp16] Add fp16 support for sigmoid (#52863)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52863

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_fixed_qparams_ops_fp16

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26672273

fbshipit-source-id: 30d5befe2a24081ac12ac773df4d2bd26d2d0192
2021-03-02 02:11:21 -08:00
Jerry Zhang
096bea5251 [reland][quant][graphmode][fx][fp16] Add fp16 support for {add|mul}{_relu} (#52714) (#53019)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53019

Test Plan:
python test/test_quantization.py TestQuantizedOps.test_add
python test/test_quantization.py TestQuantizedOps.test_mul
python test/test_quantization.py TestQuantizedOps.test_add_relu
python test/test_quantization.py TestQuantizedOps.test_mul_relu

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26725350

fbshipit-source-id: 2a89f5da6a21908f454f870521d2a4549fdd291e
2021-03-01 13:19:42 -08:00
Mike Ruberry
312b297b82 Revert D26626092: [quant][graphmode][fx][fp16] Add fp16 support for {add|mul}{_relu}
Test Plan: revert-hammer

Differential Revision:
D26626092 (2962fbb03c)

Original commit changeset: 91d040efa51e

fbshipit-source-id: cc6bcc0f451d6adcd7bf7572451e6e3cd6ad59d1
2021-03-01 04:52:47 -08:00
Jerry Zhang
2962fbb03c [quant][graphmode][fx][fp16] Add fp16 support for {add|mul}{_relu} (#52714)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52714

Test Plan:
python test/test_quantization.py TestQuantizedOps.test_add
python test/test_quantization.py TestQuantizedOps.test_mul
python test/test_quantization.py TestQuantizedOps.test_add_relu
python test/test_quantization.py TestQuantizedOps.test_mul_relu

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26626092

fbshipit-source-id: 91d040efa51e9c955eb688ec16a30f0c12233958
2021-02-27 22:12:10 -08:00
Jerry Zhang
0818dbf49d [quant][refactor] Merge add and mul handler (#52651)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52651

Merging them for easier extensions to fp16 and more binary ops

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26600118

fbshipit-source-id: a1816e593cf3065afe87d2e6e44cdace13bf6aeb
2021-02-27 19:56:32 -08:00
Jerry Zhang
b685864f50 [quant][graphmode][fx] Add reference option support for linear_static_fp16 (#52650)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52650

linear_dynamic_fp16 has the following dtypes for activation, weight, bias, output:
(fp32, fp16, fp32, fp32)

linear_static_fp16 has the following dtypes for activation, weight, bias, output:
(fp16, fp16, fp16, fp16)

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26599803

fbshipit-source-id: b4a8345d355125070be718a227288cc848cc8bbc
2021-02-27 08:25:44 -08:00
Jerry Zhang
177694681e [quant][graphmode][fx] Add reference option support for linear_dynamic_fp16 (#52534)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52534

Currently linear_dynamic_fp16 has a signature that's tied to fbgemm/qnnpack.
We'll need to produce a pattern equivalent to linear_dynamic_fp16 to support extensions
to other backends.
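
A rough sketch of what a backend-agnostic reference pattern could look like; the exact emitted graph is an assumption, not taken from this PR:

```
import torch
import torch.nn.functional as F

def reference_linear_dynamic_fp16(x_fp32, w_fp32, b_fp32):
    # weight is "quantized" to fp16 and dequantized back to fp32 right before
    # the matmul, while the activation stays in fp32 throughout
    w_fp16 = w_fp32.to(torch.float16)
    w_deq = w_fp16.to(torch.float32)
    return F.linear(x_fp32, w_deq, b_fp32)

out = reference_linear_dynamic_fp16(torch.randn(2, 8), torch.randn(4, 8), torch.randn(4))
```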

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_dynamic_fp16

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26557726

fbshipit-source-id: 270c9f781f73c79416a092b7831294cabca84b0c
2021-02-26 21:12:22 -08:00
Jerry Zhang
cb6b65699f [quant][graphmode][fx] Add support for packed params in state_dict (#51639)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51639

Test Plan: Imported from OSS

Reviewed By: z-a-f

Differential Revision: D26228185

fbshipit-source-id: 6cf8b4106fec9c6900521a2afe0de6f3d29cc896
2021-02-26 15:13:50 -08:00
Jerry Zhang
626756ac39 [quant][graphmode][api] debug --> reference (#52179)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52179

Rename debug to reference. We'll use this to produce a reference quantized model
that can be used as a common interface between PyTorch quantized models and backends.

Test Plan:
python test/test_quantization.py TestQuantizeFx

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26424656

fbshipit-source-id: a0299b023f6ba7d98f5750724c517b0ecb987b35
2021-02-19 14:20:01 -08:00
Supriya Rao
916af892b3 [quant][fx] Update name of packed weight attributes (#51259)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51259

Store the FQN of the module that is using the packed weights (the quantized op)

In the case of fusion we update the scope mapping to store the module path of the fused node.

Test Plan:
python test/test_quantization.py test_packed_weight_fused_op

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26117964

fbshipit-source-id: 9d929997baafb1c91063dd9786a451b0040ae461
2021-01-28 20:31:08 -08:00
Supriya Rao
288b94a8ee [quant][fx] Make scale, zero_point buffers in the model, use FQN (for quantize_per_tensor ops) (#51171)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51171

Following up on the previous PR, this PR registers scale and zero_point for quantize_per_tensor
as buffers in the module.
Currently the dtype is still stored as an attribute (not registered as a buffer) since we can only register tensor types.

Test Plan:
python test/test_quantization.py test_qparams_buffers

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26092964

fbshipit-source-id: a54d914db7863402f2b5a3ba2c8ce8b27c18b47b
2021-01-28 08:35:46 -08:00
Supriya Rao
4c3f59b70e [quant][fx] Make scale, zero_point buffers in the model and use FQN (for quantized ops) (#51166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51166

Currently scale and zero_point values are stored as constants in the graph.
This prevents these values from being updated in the graph and also prevents saving
these values to the state_dict.

After this PR we store scale/zero_point values for quantized ops as buffers in the root module
and create get_attr nodes for them in the graph.

We also use the FQN of the module where the quantized ops are present to name these attributes so
that they can be uniquely identified and mapped to quantized ops.
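
A rough sketch of the idea (not the actual implementation), with a hypothetical FQN:

```
import torch

# Rather than baking scale/zero_point into the graph as constants, register
# them as buffers on the root module under names derived from the FQN of the
# module using the quantized op, and reference them via get_attr nodes.
root = torch.nn.Module()
fqn = "sub.linear"  # hypothetical FQN
root.register_buffer(fqn.replace(".", "_") + "_scale_0", torch.tensor(0.05))
root.register_buffer(fqn.replace(".", "_") + "_zero_point_0",
                     torch.tensor(128, dtype=torch.int64))

# the buffers now show up in state_dict and can be updated after the fact
print(list(root.state_dict().keys()))
```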

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qparams_buffers

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26092965

fbshipit-source-id: b549b2d3dccb45c5d38415ce95a09c26f5bd590b
2021-01-28 08:35:42 -08:00
Supriya Rao
096adf4b8b [quant][fx] Scope support for call_function in QuantizationTracer (#51086)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51086

Previously we only supported getting the scope for call_module and a custom qconfig dict for call_module.
This PR extends the Scope class to record the scope for all node types.
For call_function qconfig, if module_name is specified it takes precedence over the function qconfig.
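
A hypothetical qconfig_dict illustrating the precedence described above; the submodule name is an assumption:

```
import torch.nn.functional as F
from torch.quantization import get_default_qconfig

qconfig = get_default_qconfig("fbgemm")

qconfig_dict = {
    "": qconfig,
    "object_type": [(F.linear, qconfig)],   # qconfig for F.linear call_functions
    "module_name": [("sub", None)],         # wins for F.linear calls inside "sub"
}
```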

Test Plan:
python test/test_quantization.py test_qconfig_for_call_func

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26077602

fbshipit-source-id: 99cdcdedde2280e51812db300e17d4e6d8f477d2
2021-01-28 08:32:24 -08:00
Jerry Zhang
f10e7aad06 [quant][graphmode][fx] Scope support for call_method in QuantizationTracer (#50173)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50173

Previously we did not set the qconfig for call_method nodes correctly, since doing so requires knowing
the scope (the module path of the module whose forward graph contains the node) of the node. This
PR modifies the QuantizationTracer to record the scope information and build a map from call_method
Node to module path, which is used when we construct qconfig_map.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qconfig_for_call_method

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D25818132

fbshipit-source-id: ee9c5830f324d24d7cf67e5cd2bf1f6e0e46add8
2021-01-11 10:43:58 -08:00
Jerry Zhang
f6f0fde841 [reland][quant][graphmode][fx] Standalone module support {input/output}_quantized_idxs (#49754) (#50058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50058

This PR adds support for {input/output}_quantized_idxs for standalone modules.

If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module expects float
input and produces float output, and quantizes the input and dequantizes the output internally.

If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module expects quantized
input and produces quantized output; the input is quantized in the parent module, and the output is
dequantized in the parent module as well. This is similar to current quantized modules like nn.quantized.Conv2d.

For more details, please see the test case.
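
A hypothetical config, following the standalone-module format described in the "Change standalone module api" entry further down this page:

```
# index 0 of the standalone module's inputs and outputs is expected/produced
# as a quantized tensor; quantize/dequantize happen in the parent module
standalone_prepare_config = {
    "input_quantized_idxs": [0],
    "output_quantized_idxs": [0],
}
prepare_custom_config_dict = {
    "standalone_module_name": [
        ("standalone_module", None, standalone_prepare_config),
    ],
}
```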

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D25768910

fbshipit-source-id: 96c21a3456cf192c8f1400afa4e86273ee69197b
2021-01-05 20:27:46 -08:00
Mike Ruberry
46cf6d332f Revert D25684692: [quant][graphmode][fx] Standalone module support {input/output}_quantized_idxs
Test Plan: revert-hammer

Differential Revision:
D25684692 (89b4899ea5)

Original commit changeset: 900360e01c0e

fbshipit-source-id: 8b65fa8fbc7b364fbddb5f23cc696cd9b7db98cd
2020-12-24 15:50:52 -08:00
Jerry Zhang
89b4899ea5 [quant][graphmode][fx] Standalone module support {input/output}_quantized_idxs (#49754)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49754

This PR adds support for {input/output}_quantized_idxs for standalone modules.

If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module expects float
input and produces float output, and quantizes the input and dequantizes the output internally.

If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module expects quantized
input and produces quantized output; the input is quantized in the parent module, and the output is
dequantized in the parent module as well. This is similar to current quantized modules like nn.quantized.Conv2d.

For more details, please see the test case.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D25684692

fbshipit-source-id: 900360e01c0e35b26fe85f4a887dc1fd6f7bfb66
2020-12-23 22:36:57 -08:00
Jerry Zhang
f474ffa1a9 [quant][graphmode][fx] Change standalone module api (#49719)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49719

We find there are multiple use cases for standalone modules: one use case requires the standalone module
to produce a module that takes a float Tensor as input and outputs a float Tensor; the other needs to
produce a module that takes a quantized Tensor as input and outputs a quantized Tensor.

This is similar to `quantized_input_idxs` and `quantized_output_idxs`, so we want to nest
prepare_custom_config_dict in the standalone module configuration. For maximum flexibility we also
include qconfig_dict for the standalone module as well, in case the user needs a special qconfig_dict for
the standalone module in the future.

Changed from
```python
prepare_custom_config_dict =
{
  "standalone_module_name": ["standalone_module"],
   "standalone_module_class": [StandaloneModule]
 }
```
to
```python
prepare_custom_config_dict =
{
  "standalone_module_name": [("standalone_module", qconfig_dict1, prepare_custom_config_dict1)],
  "standalone_module_class": [(StandaloneModule, qconfig_dict2, prepare_custom_config_dict2)]
 }
```
The entries in the config are:
1. name/module_class
2. optional qconfig_dict; when it is None, we'll use {"": qconfig} where qconfig is the one from the parent qconfig_dict
3. optional prepare_custom_config_dict; when it is None, we'll use the default value of prepare_custom_config_dict for the prepare API (None)

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D25675704

fbshipit-source-id: 0889f519a3e55a7a677f0e2db4db9a18d87a93d4
2020-12-22 21:58:40 -08:00
Vasiliy Kuznetsov
de07d07600 fx quant: improve types on convert (#49688)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49688

Adds more type annotations to the FX quantize convert code, fixing issues as they
are uncovered by mypy.

Test Plan:
```
mypy torch/quantization
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25667231

fbshipit-source-id: 262713c6ccb050a05e3119c0457d0335dde82d25
2020-12-22 16:53:23 -08:00
Vasiliy Kuznetsov
19f972b696 fx quant: do not observe bias on F.linear (#49628)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49628

Ensures that the linear bias is not observed in an `F.linear` call. This should
be a small speedup in PTQ, and will change numerics (in a good way) for
QAT if someone is using `F.linear`.

Note: the implementation is slightly more verbose compared to conv
because bias is a keyword argument in Linear.
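
A minimal sketch of the functional-linear-with-bias shape this targets (hypothetical module, not the test case):

```
import torch
import torch.nn.functional as F

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.randn(4, 4))
        self.b = torch.nn.Parameter(torch.randn(4))

    def forward(self, x):
        # after this change, prepare_fx observes x and self.w here,
        # but places no observer on the bias keyword argument
        return F.linear(x, self.w, bias=self.b)

print(M()(torch.randn(2, 4)).shape)
```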

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_linear_functional_bias_not_observed
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25653532

fbshipit-source-id: c93501bf6b55cbe4a11cfdad6f79313483133a39
2020-12-22 16:53:21 -08:00
Vasiliy Kuznetsov
c3a7591cef fx quant: do not observe bias on F.conv (#49623)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49623

(not ready for review)

Ensures that the conv bias is not observed in an `F.conv{n}d` call.

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25652856

fbshipit-source-id: 884f87be1948d3e049a557d79bec3c90aec34340
2020-12-22 16:49:50 -08:00
Vasiliy Kuznetsov
edce6b138d fx quant: fix types on _find_quants (#49616)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49616

Add types to `_find_quants` I/O and fix resulting errors,
needed for an upcoming bug fix.

Test Plan:
```
mypy torch/quantization
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25645719

fbshipit-source-id: 4bf788b55fd4fd086c83a4438b9c2df22b9cff49
2020-12-21 21:05:57 -08:00
Jerry Zhang
5cde23fdd4 [quant][graphmode][fx] Allow user to specify qconfig for call_method (#49621)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49621

This adds support for configuring a qconfig for a call_method, e.g. x.chunk; this will help work around
a problem in our internal model.

TODO: since call_method is also a string and we flatten the qconfig, we might need to resolve the namespace
conflict between call_method and module_name.
TODO: Add scope support to set the qconfig for call_method correctly with the original qconfig.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D25651828

fbshipit-source-id: 82d66b121d37c8274fd481b6a2e9f9b54c5ca73d
2020-12-18 20:21:52 -08:00
Vasiliy Kuznetsov
82ac6c75af fx quant: make sure observer is inserted before a quantized output (#49420)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49420

Before: if an output was marked as quantized, it could end up not actually
being quantized if the previous node was not quantized.

After: if an output was marked as quantized, it will be quantized
regardless of the quantization status of the previous node.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_quant_output_always_observed
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25566834

fbshipit-source-id: 84755a1605fd3847edd03a7887ab9f635498c05c
2020-12-16 18:53:37 -08:00
Vasiliy Kuznetsov
84506e0316 fx quant: fix fq when input is quantized and node does not need fq (#49382)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49382

Fixes an edge case: if the input to the graph is quantized and the
first node does not need activation observation, make sure that
no observer is inserted.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_int8_input_no_unnecessary_fq
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25551041

fbshipit-source-id: a6cba235c63ca7f6856e4128af7c1dc7fa0085ea
2020-12-16 18:53:33 -08:00
Vasiliy Kuznetsov
7542076097 fx quant: do not insert observers at quantized inputs (#49239)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49239

Context: the existing implementation of `quantized_input_idxs` is convert-only.
Therefore, observers are inserted between the input and the first
quantized node.  This is a problem during QAT, because the initial
input is a fake_quant, and it starts with scale=1 and zp=0.  This does
not match the quantization parameters of the graph input, which can
lead to incorrect numerics.

Fix: do not insert observer for a quantized input.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25499486

fbshipit-source-id: 303b49cc9d95a9fd06fef3b0859c08be34e19d8a
2020-12-16 18:53:30 -08:00
Vasiliy Kuznetsov
92df8706a0 fx quant: move {input|output}_quantized_idxs cfg from convert to prepare (#49238)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49238

Moves the `input_quantized_idxs` and `output_quantized_idxs` options
from the convert config to the prepare config.  This is done because
these operations are related to placing observers, which is numerics
changing during QAT.

The next PR will adjust the behavior of `input_quantized_idxs` in
prepare in QAT to prevent placing a fake_quant at the input if the
input is marked quantized.  Placing a fake_quant there can lead to
numerical inaccuracies during calibration, as it would start with
scale=1 and zp=0, which may be different from the quantization
parameters of the incoming quantized input.
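
Hypothetical usage after this move, assuming the FX graph mode quantization API of this era:

```
import torch
from torch.quantization import get_default_qat_qconfig
from torch.quantization.quantize_fx import prepare_qat_fx

model = torch.nn.Sequential(torch.nn.Linear(4, 4)).train()
qconfig_dict = {"": get_default_qat_qconfig("fbgemm")}

# the indices now live in the prepare config, so no fake_quant is placed at
# graph input 0 (assumed already quantized) and output 0 stays quantized
prepare_custom_config_dict = {
    "input_quantized_idxs": [0],
    "output_quantized_idxs": [0],
}
prepared = prepare_qat_fx(model, qconfig_dict, prepare_custom_config_dict)
```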

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25498762

fbshipit-source-id: 17ace8f803542155652b310e5539e1882ebaadc6
2020-12-16 18:53:27 -08:00
Vasiliy Kuznetsov
d033e185ed fx quant: move more functions to utils (#48908)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48908

No logic change, improving readability

Test Plan:
CI

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25363080

fbshipit-source-id: 1d73a875bd7abf671b544ebc835432fea5306dc3
2020-12-08 15:37:04 -08:00