Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59739
Created an EqualizationQConfig specifically for equalization.
It inherits from QConfig and is used to distinguish between inserting
an input observer and an output observer. Since the output observer
field is included in the EqualizationQConfig, we no longer need an
output observer field in the _InputEqualizationObserver.
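For context, a minimal sketch of what such a config could look like (the field names here are assumptions for illustration, not necessarily the exact upstream definition):
```
import torch
from collections import namedtuple

class EqualizationQConfig(namedtuple("EqualizationQConfig", ["input_activation", "weight"])):
    """Holds observer constructors (classes or functools.partial objects),
    mirroring torch.quantization.QConfig."""
    def __new__(cls, input_activation, weight):
        # Reject pre-instantiated observers, matching QConfig's convention.
        if isinstance(input_activation, torch.nn.Module) or isinstance(weight, torch.nn.Module):
            raise ValueError("EqualizationQConfig received observer instances; "
                             "pass observer classes or partials instead")
        return super().__new__(cls, input_activation, weight)
```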
Test Plan:
compiles
Imported from OSS
Reviewed By: ezyang
Differential Revision: D29135298
fbshipit-source-id: 3dde9c029c291467ff0a0845f0fc9c44573fc6f6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59605
Enables targeting of individual function invocations by execution order.
For example, given a module such as
```
import torch

class M1(torch.nn.Module):
    def forward(self, x):
        x = torch.add(x, x)
        x = torch.add(x, x)
        return x

class M2(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.m1 = M1()

    def forward(self, x):
        x = self.m1(x)
        return x
```
We can now target the first add of `m1` with
```
qconfig_dict = {
    "module_name_function_order": ("m1", torch.add, 0, custom_qconfig),
}
```
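For illustration, a hedged end-to-end sketch of how this might be used (the qconfig choice, calibration input, and global `""` entry are assumptions; API details may have changed since this PR):
```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

custom_qconfig = get_default_qconfig("fbgemm")
qconfig_dict = {
    "": None,  # leave everything else unquantized
    "module_name_function_order": ("m1", torch.add, 0, custom_qconfig),
}
model = M2().eval()                     # M2 from the example above
prepared = prepare_fx(model, qconfig_dict)
prepared(torch.randn(4, 4))             # calibrate
quantized = convert_fx(prepared)
```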
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_qconfig_module_name_function_order
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D28951077
fbshipit-source-id: 311d423724a31193d4fa4bbf3a712b46464b5a29
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59799
This is a redo of #58574, easier to create a new PR than to fix rebase
conflicts, as there have been a large number of refactors to the
underlying code.
Removes some code which was incorrectly added by #57519 but never
actually used for anything.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29031955
fbshipit-source-id: f407d181070cb283382965952821e3647c705544
Summary: Implemented two observers (InputEqualObserver and WeightEqualObserver) which will be inserted into the graph during prepare_fx().
Test Plan: python test/test_quantization.py TestEqualizeFx
Reviewed By: supriyar
Differential Revision: D28836954
fbshipit-source-id: 25517dc82ae67698ed8b2dc334e3323286976104
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59042
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724867
fbshipit-source-id: 9f87d51020caa20d5408cb2820947e23d92d5fc3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59041
Static quantization for Custom module support was removed in a previous refactor
https://github.com/pytorch/pytorch/pull/57519 since it's not covered by the test case
This PR re-enabled the test case and fixed the support
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724866
fbshipit-source-id: 1974675b88b56a2173daf86965d6f3fb7ebd783b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59040
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724870
fbshipit-source-id: c0f748711b825cd46bdfcc05c054c77a41e8207a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59039
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724874
fbshipit-source-id: bd984716b2da1d6879c3e92fa827574783a41567
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59038
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724869
fbshipit-source-id: e8501c9720b5ddb654e78bc8fa08de0466c1d52b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59037
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724865
fbshipit-source-id: 6c6824d0af7dd47d4c111d6a08e373bc65f33e08
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59036
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724862
fbshipit-source-id: 5900420127fcc14846bc34c9ac29ff7e6a703f1e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59035
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724872
fbshipit-source-id: d32752c635917c9820e5e7cc414ba9d48a258a19
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59034
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724873
fbshipit-source-id: 870e0822843ad1d035f41eaa015bdde9ccf6ec23
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59033
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724861
fbshipit-source-id: 97b38e851b6bf581510a24636b1d8d6f1d977f5a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59032
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724868
fbshipit-source-id: 6df639f20076b480812b6dcf0fc7d2c87ca29d8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59028
Previously we had both an env and a quant_env in convert, which was a bit confusing.
In this PR we merge them into a single Dict[str, Tuple[Node, torch.dtype]].
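A minimal sketch of the merged environment's shape (the helper below is hypothetical):
```
from typing import Dict, Tuple
import torch
from torch.fx import Node

Env = Dict[str, Tuple[Node, torch.dtype]]

def load_arg(env: Env, name: str, expected_dtype: torch.dtype) -> Node:
    # One lookup replaces the old split between env and quant_env: the
    # entry carries both the converted node and the dtype it produces.
    node, dtype = env[name]
    assert dtype == expected_dtype, f"{name} is {dtype}, expected {expected_dtype}"
    return node
```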
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724863
fbshipit-source-id: 722a682c70d300a6ccd2b988786a1ac2d45e880e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57068
When training with the histogram observer on, we got this runtime error:
```
torch/quantization/observer.py", line 942, in forward
        self.bins)
    self.histogram.resize_(combined_histogram.shape)
    ~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    self.histogram.copy_(combined_histogram)
    self.min_val.resize_(combined_min.shape)
RuntimeError: cannot resize variables that require grad
```
Since the histogram observer is only used to collect histogram information, it should not need gradients, so we turn off grad before resizing using the `detach_()` method.
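A sketch of the fix along those lines (not necessarily the exact upstream diff):
```
# inside HistogramObserver.forward, before the in-place resizes:
self.histogram.detach_()  # drop the autograd tracking so resize_ is allowed
self.histogram.resize_(combined_histogram.shape)
self.histogram.copy_(combined_histogram)
self.min_val.detach_()
self.min_val.resize_(combined_min.shape)
self.min_val.copy_(combined_min)
```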
Test Plan:
- arc lint
- Train with histogram observer turned on, training finished successfully
f264139727
Reviewed By: supriyar
Differential Revision: D27147212
fbshipit-source-id: abed5b9c4570ffc6bb60e58e64791cfce66856cd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57067
Auto-format the code.
Test Plan: lint
Reviewed By: jerryzh168
Differential Revision: D27147213
fbshipit-source-id: 008871d276c8891b2411549e17617e5c27d16ee3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58792
Enables support for fused modules like ConvReLU or LinearReLU in eager mode cross-layer equalization.
Test Plan:
`python test/test_quantization.py TestEqualizeEager`
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28647242
fbshipit-source-id: 286e057ce70aa7de45d575afd6c13e55120ff18a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58566
Validates the keys of the qconfig_dict, prepare_custom_config_dict, convert_custom_config_dict, and
fuse_custom_config_dict. If the user passes in an invalid key or makes a typo, we will throw an error and let the user know which keys are supported.
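A minimal sketch of the idea (helper name and message wording are assumptions):
```
def check_is_valid_config_dict(config_dict, allowed_keys, dict_name):
    # Reject unknown keys so typos fail loudly instead of being silently ignored.
    for key in config_dict:
        if key not in allowed_keys:
            raise ValueError(
                f"Expected {dict_name} to have one of the following keys: "
                f"{sorted(allowed_keys)}. But found '{key}' instead.")
```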
Test Plan:
Imported from OSS
python test/test_quantization.py
Reviewed By: jerryzh168
Differential Revision: D28540923
fbshipit-source-id: 5958c32017b7d16abd219aefc8e92c42543897c2
Summary:
Enables quantization on XPU devices. Keeps the model as-is if the model is on an XPU device.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54857
Reviewed By: ailzhang
Differential Revision: D28501381
Pulled By: jerryzh168
fbshipit-source-id: 6d3e9b04075393248b30776c69881f957a1a837c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58453
Moves the class method generate_qconfig_map to qconfig_utils. More PRs will follow
to move functions out of Quantizer and eventually remove the Quantizer object.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28497965
fbshipit-source-id: 3c78cfe676965d20a8834a859ffed4d8e9ecade4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58445
Previously the output of a statically quantized fp16 operator was not quantized in QuantizeHandler, which is inconsistent with
the behavior of static int8 operators and also does not work well with reference functions. This PR
changes the fp16 static QuantizeHandler to quantize (call to(torch.float16)) in the QuantizeHandler, which also
makes future support for reference functions easier.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28495830
fbshipit-source-id: 2140eab8ab2dd08f6570d9e305485e3029e1f47d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58461
Improves the logic which calculates whether a node has any tensors
in its arguments by terminating the recursion early when possible.
In a future PR, we should probably ditch this entire approach and switch to
using dtype propagation.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28499455
fbshipit-source-id: bedd844022b90e1fcb7d7a3cb4cc65440dc9cc59
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58416
https://github.com/pytorch/pytorch/pull/57519 had a regression not
caught by CI: it added an assertion which failed on various model
output types.
This PR removes the assertion and adds the logic to observe graph
outputs in a way that supports arbitrary output formats.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_output_lists_and_dicts
```
Imported from OSS
Reviewed By: z-a-f
Differential Revision: D28479946
fbshipit-source-id: bcce301f98a057b134c0cd34ab0ca96ba457863f
Summary:
tl;dr; rewrites the FX graph mode quantization observer insertion to be easier to understand and extend.
The key conceptual difference from before is:
* before: for each node, observers are always inserted to the output of the current node, even if they are needed for the next node. This is hard to reason about.
* after: for each node, observers are inserted to the inputs (if needed, as calculated by the dtype of the argument and dtype of current node) and to the output (if needed for the type of pattern and qconfig). There is no knowledge of future nodes needed to insert observers for the current node.
This allows us to significantly simplify various things:
* all new observers needed for a node are inserted together. This makes it easier to understand and debug things. We add an invariant that node X will never change any observers inserted by any preceding or subsequent node, so to debug an issue the user can just understand what is happening for node X, without having to understand what happens before or after it.
* all the state tracking of activation_post_process_map and activation_post_process_indices are removed, instead observers are looked up by graph traversals
* since there is no longer a need for overlapping graph passes which mutate each other's intermediate state, it is easier to understand what the rules are for inserting observers, and to create new rules in the future (see the sketch below).
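A conceptual sketch of the new per-node pass (all helper names here are hypothetical):
```
for node in model.graph.nodes:
    qconfig = qconfig_map.get(node.name)
    if qconfig is None:
        continue
    # 1. observe inputs, decided only by the arg dtypes and this node's
    #    target dtype -- no knowledge of future nodes is needed
    for arg in node.all_input_nodes:
        if needs_input_observer(arg, node, qconfig):
            insert_observer(model, arg, qconfig.activation())
    # 2. observe the output, decided only by this node's pattern and qconfig
    if needs_output_observer(node, qconfig):
        insert_observer(model, node, qconfig.activation())
```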
Test Plan:
```
# all OSS tests pass
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Differential Revision: D28241864
Reviewed By: jerryzh168
Pulled By: vkuzo
fbshipit-source-id: 950d58972d26362808564cc0a2dfb30413a3734d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57210
Removes the manually specified string name for sets of
related ops, and replaces it with an automatically generated
index. The manual name was arbitrary and ok for an MVP, but
is not safe for wide usage.
Also, adds APIs for users to add custom functions to the
relatedness map, by either pairing them with a known function
or creating a new relatedness set (see the sketch below).
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28077977
fbshipit-source-id: e64a1ad6cd063014d74cdad189b0a612b1143435
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57186
Before this PR, we matched any pair of nodes with equal or related
types.
This PR changes the behavior to only match nodes whose type is in
the allowlist (the relatedness mappings). This will prevent matching
user defined modules, unless users add them to the mappings.
This is motivated by a couple of things:
1. if user defined types are matched, it can break scriptability of the
model with loggers attached. This happens whenever the user module
has a return type of anything other than a Tensor or a tuple of
Tensors.
2. we tried the past behavior on a couple of models, and it hasn't been
useful.
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher
python test/test_quantization.py TestFXGraphMatcherModels
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28077981
fbshipit-source-id: 0a698e52b807cda47e6923310448a985b26eb362
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57184
Add remaining types to the relationship mapping to have full coverage
of ops quantization knows about, except binary ops and RNNs.
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher.test_op_relationship_mapping
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28077979
fbshipit-source-id: 0f6070c8a995032978702d088803f89ff25f2a7f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57171
No logic change, just moving the mapping to a file where
the other mappings are.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28077978
fbshipit-source-id: 4049d6a498156a5dffe3a03d2f4abc79da7bf907
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57470
Removes the earlier hack of matching patterns originally matched
to BinaryOpQuantizeHandler to switch to CopyHandler. After this PR,
each pattern can only be matched to one type of QuantizeHandler or
to nothing.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28152909
fbshipit-source-id: afc285e770bd7eb0518c90e3ee4874c421e78bbc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57375
Skips observing the input for masked_fill. Currently we don't have a way to
query the type of a Proxy in a GraphModule; once we have the functionality to annotate types,
we'll annotate the Proxy as a boolean Tensor and remove this hack.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_boolean_tensor
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28126003
fbshipit-source-id: 2989766370a607579b3ea07ca36cdc2ce35893cc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57402
This is a cleanup, the value is not used by anything. It was
probably left behind after previous refactors.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28133622
fbshipit-source-id: 44a3f955d4af8d6dd15b4fb3038188568e4ee549
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57399
There were a couple of functions which took `quants` as arguments
without using them, probably left over from after past refactors.
Cleaning this up to improve code readability.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28132413
fbshipit-source-id: 636b146c0b5ef0caea9c4b539e245de245d48c49
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57393
Moves the logic which determines whether the output is quantized
based on the inputs to live
on the qhandler object. This allows us to remove
FixedQParamsOpQuantizeHandler from quantize.py, further reducing
the coupling between handler objects and the quantization pass.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: astaff
Differential Revision: D28132414
fbshipit-source-id: 5c28524b47c00f618d3a38657376abae9e6ffe7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57388
It's a bit confusing to have this be a decorator. It's simpler to
just expose it as a function on qhandler.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28129411
fbshipit-source-id: f7316f285e8546c67e8d8cf753462b2c2abb2636
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57377
Moves the logic which determines
1. whether a pattern instance's output should be observed
2. whether a pattern instance's output should be marked as observed based on its inputs
3. whether to override the activation specified in the qconfig
from `quantize.py` to `quantization_patterns.py`. This makes
the code easier to read and reduces the coupling between `Quantizer`
and `QuantizeHandler` instances.
Note: there are some further cleanups which would be good after this one
- leaving those for future PRs.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28126896
fbshipit-source-id: 94c80a9c7307452783348d65b402acc84983e3f6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57368
1. renames functions which only sometimes insert observers to start with `maybe_`,
to clarify the difference from functions which always insert observers
2. saves a level of indent in `maybe_insert_observer_for_output_of_the_node`
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28126897
fbshipit-source-id: 4cbc184dbf5e85954314cfbbcdd1551474175bf0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57367
This code is never hit (see insert_observer_for_output_of_the_node
which gates it out), so changing to an assert in order to
have `insert_observer` actually always insert an observer.
This helps code readability.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28126898
fbshipit-source-id: 411bc37769a6eacbebc463ed6c84cac85871bd5e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56927
Adds the connection of `torch.add` to `toq.add_relu` and of `torch.mul`
to `toq.mul_relu`.
Test Plan:
CI
Imported from OSS
Reviewed By: supriyar
Differential Revision: D28003475
fbshipit-source-id: a12871feacf84c5afb0e1cc47e708e285695ffeb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57173
If getitem is followed by an unmatched node, we'll remove the observer after it.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_getitem
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28068805
fbshipit-source-id: e79f8ec3e8fd61d348b8a7069ab0bb434d737c30
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57028
Adds a test case for wrapped sigmoid, and fixes the following issues
to make it pass in NS:
* allows comparing between x.sigmoid() and torch.sigmoid(x), if they are related
* allows dtype cast from FP32_OR_INT8 to FP32, via dequantize (this will be improved later)
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_user_defined_function
```
Reviewed By: jerryzh168
Differential Revision: D28030089
Pulled By: vkuzo
fbshipit-source-id: b237353e2d564a4476f409df461746a259015a4b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57027
Fixes a bug to allow shadowing of linear and conv functionals.
The bug is to only detach tensors, not all objects.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_int8_shadows_int8_fun
```
Reviewed By: jerryzh168
Differential Revision: D28030090
Pulled By: vkuzo
fbshipit-source-id: 0a38c4b232e007d7822eee818b0af99d98335d22
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57026
Adds a config option to skip matching classes by class type
and functions by function type.
This is useful when users make custom modules which return
types other than tensors. With the current implementation of
Logger, these are not scriptable.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_user_module_scriptable
```
Reviewed By: jerryzh168
Differential Revision: D28030093
Pulled By: vkuzo
fbshipit-source-id: 71dc54dd935d2071c4b017260ea2a1e5c2298bfe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57025
Adds the ability to log unshadowed inputs of binary ops such as `add`
and `mul`, when indices 0, 1, or 0 and 1 are tensors.
Note: making shadowing support this is saved for a future PR.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_add_mul_inputs_activations
```
Reviewed By: jerryzh168
Differential Revision: D28030098
Pulled By: vkuzo
fbshipit-source-id: fd46760faac153975cd7688e70c44991ec1d5dff
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57024
Enables shadow copies of fp16 emulation patterns where weights
are cast to fp16 before being passed to linear. This previously
did not work because copying of `call_method` nodes was not implemented.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_linear_fp16_vs_linear_fp16_shadow_activations
```
Reviewed By: jerryzh168
Differential Revision: D28030096
Pulled By: vkuzo
fbshipit-source-id: 13a39ea6c106180df6d750246672286b58b4d04c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57022
Allows usage of user functions in NS shadow APIs. We expose the
i/o mapping to the user APIs, and thread them throughout the code.
Note: the format of the mapping is currently not the best. Saving
improving that for a future PR.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_user_defined_function
```
Reviewed By: jerryzh168
Differential Revision: D28030095
Pulled By: vkuzo
fbshipit-source-id: 2863312362223ad276437e2aeeec4a3f71b691c7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57021
To support shadows of custom functions, we need to allow the user to
specify the I/O types of the custom functions.
This PR is a cleanup in preparation for making the above happen.
We make the I/O dtype mappings be generated by a function instead
of a global variable. In the next PR, we will add a hook so the user
can modify these mappings.
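A sketch of the refactor (dict keys and contents here are illustrative):
```
import torch.nn.functional as F

def get_node_type_to_io_type_map():
    # Returning a fresh dict per call lets the future user hook add custom
    # entries without mutating shared global state.
    return {
        "funs_io_type_fp32": {F.linear, F.conv2d},
        "funs_io_type_int8": set(),
    }

mappings = get_node_type_to_io_type_map()
mappings["funs_io_type_fp32"].add(F.conv1d)  # what the upcoming hook enables
```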
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Reviewed By: jerryzh168
Differential Revision: D28030094
Pulled By: vkuzo
fbshipit-source-id: 3cbb617f034ef385c2875c4ec7fed13ca30bfc57
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56762
Adds a test case for wrapped sigmoid, and fixes the following issues
to make it pass in NS:
* allows comparing between x.sigmoid() and torch.sigmoid(x), if they are related
* allows dtype cast from FP32_OR_INT8 to FP32, via dequantize (this will be improved later)
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_user_defined_function
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27960766
fbshipit-source-id: 02935d2f400aa0b8f3d51bbf664a6c8ca89aa811
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56742
Fixes a bug to allow shadowing of linear and conv functionals.
The bug is to only detach tensors, not all objects.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_int8_shadows_int8_fun
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27960767
fbshipit-source-id: abc911ca4b9edafd1effb9dada7731981538c2df
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56493
Adds a config option to skip matching classes by class type
and functions by function type.
This is useful when users make custom modules which return
types other than tensors. With the current implementation of
Logger, these are not scriptable.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_user_module_scriptable
```
needs more testing before landing
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27886107
fbshipit-source-id: ec92c4f7ab7141021bc022f07b3b558b42bbb986
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56408
Adds the ability to log unshadowed inputs of binary ops such as `add`
and `mul`, when indices 0, 1, or 0 and 1 are tensors.
Note: making shadowing support this is saved for a future PR.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_add_mul_inputs_activations
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27864296
fbshipit-source-id: 3cbeb728297aa192d1ea17e815299709fd9db056
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56384
Enables shadow copies of fp16 emulation patterns where weights
are cast to fp16 before being passed to linear. This previously
did not work because copying of `call_method` nodes was not implemented.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_linear_fp16_vs_linear_fp16_shadow_activations
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27857735
fbshipit-source-id: 7c1a067f035acf7322175f8535876d0ead88a86a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56301
Allows usage of user functions in NS shadow APIs. We expose the
i/o mapping to the user APIs, and thread them throughout the code.
Note: the format of the mapping is currently not the best. Saving
improving that for a future PR.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_user_defined_function
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27833189
fbshipit-source-id: dac418e294d1c9b204efbf4071d5cc12a9e784c0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56296
To support shadows of custom functions, we need to allow the user to
specify the I/O types of the custom functions.
This PR is a cleanup in preparation for making the above happen.
We make the I/O dtype mappings be generated by a function instead
of a global variable. In the next PR, we will add a hook so the user
can modify these mappings.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27831996
fbshipit-source-id: 782f5e77de0eef3899b9b7def0fdabd8dcafef12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56292
Adds hooks for specifying user defined functions to NS weight and
unshadowed activation APIs.
Adding it to shadowed activation APIs will be a bit more work, upcoming
in a separate PR.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_user_defined_function
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27830409
fbshipit-source-id: 6bbddc3062c0b3e412a3147244795319c0785a92
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56283
Exposes the `base_name_to_sets_of_related_ops` variable
to the graph matching API, so that users can add relationships
for custom functions. This is needed to enable full support of
external functions for custom backends.
The next PR will extend this to the NS APIs.
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher.test_user_defined_function
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27830410
fbshipit-source-id: 8688cf697d388c52e3d18f108765edfca3c3d3aa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56550
Add support for preserving a list of attributes on observed/quantized GraphModule
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_deepcopy_preserve_attributes
Imported from OSS
Reviewed By: vkuzo, kazhang
Differential Revision: D27899317
fbshipit-source-id: ebf21334715e5ab764aaa27eed534cc0cdf9f2b5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54924
Previously we produced torch.ops.quantize.cat, which takes inputs, dequantizes them,
and requantizes with new qparams. This PR changes that to produce torch.cat directly; torch.cat
assumes all inputs share the same qparams, and it produces a quantized Tensor with
the same qparams as its inputs (a previous PR makes sure all inputs and the output of cat share
the same observer/fakequant instance).
Using torch.cat is expected to be more efficient since it does not introduce extra quant/dequant.
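An illustration of the intended behavior (my example, not from the PR): torch.cat on quantized tensors carries the shared qparams through instead of rescaling:
```
import torch

x = torch.quantize_per_tensor(torch.randn(2, 3), scale=0.1, zero_point=0, dtype=torch.quint8)
y = torch.quantize_per_tensor(torch.randn(2, 3), scale=0.1, zero_point=0, dtype=torch.quint8)
out = torch.cat([x, y], dim=0)  # no extra quant/dequant inserted
assert out.is_quantized and out.q_scale() == 0.1
```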
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_cat
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27416528
fbshipit-source-id: 896c280abec2903c29d597c655729666583ff0dd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56004
Added reference pattern support for GELU, softmax, and bmm for int dtypes. For GELU and softmax, this consisted of adding reference patterns to the default node handler for int dtypes. Note that the GELU and softmax patterns are not registered, since they do not have a proper quantized kernel; they would either add unnecessary dequant and quant ops to the network or simply error. This can be circumvented with custom qconfig usage, as in test_gelu_reference.
bmm was added within binary ops, along with some significant changes to how that code is structured. Theoretically the reference pattern used for bmm could be applied to other dtypes. This was not enabled because of issues relating to Line 1323 in quantize.py: the prepare step does not know whether an op will use a reference pattern or not, so for ops that are supported with one dtype in reference mode and another dtype normally, this has the potential to cause issues. This is difficult to get around without the is_reference flag being available in the prepare step, or the discussed changes around separating prepare and convert.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_gelu_reference
python test/test_quantization.py TestQuantizeFxOps.test_gelu_normal
python test/test_quantization.py TestQuantizeFxOps.test_softmax_reference
python test/test_quantization.py TestQuantizeFxOps.test_softmax_normal
python test/test_quantization.py TestQuantizeFxOps.test_silu_reference
python test/test_quantization.py TestQuantizeFxOps.test_bmm_int_reference
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxModels
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D27818340
fbshipit-source-id: de65be0797035463cd2d1b0e4677d1a87f69143c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56391
Previously we only supported keeping the output quantized for tensor outputs; this PR adds support
for lists and dicts (values) as well.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27860327
fbshipit-source-id: e770160ced47a7173abff5505ec620bd2b1a0b01
Summary:
As this diff shows, currently there are a couple hundred instances of raw `noqa` in the codebase, which just ignore all errors on a given line. That isn't great, so this PR changes all existing instances of that antipattern to qualify the `noqa` with respect to a specific error code, and adds a lint to prevent more of this from happening in the future.
Interestingly, some of the examples the `noqa` lint catches are genuine attempts to qualify the `noqa` with a specific error code, such as these two:
```
test/jit/test_misc.py:27: print(f"{hello + ' ' + test}, I'm a {test}") # noqa E999
test/jit/test_misc.py:28: print(f"format blank") # noqa F541
```
However, those are still wrong because they are [missing a colon](https://flake8.pycqa.org/en/3.9.1/user/violations.html#in-line-ignoring-errors), which actually causes the error code to be completely ignored:
- If you change them to anything else, the warnings will still be suppressed.
- If you add the necessary colons then it is revealed that `E261` was also being suppressed, unintentionally:
```
test/jit/test_misc.py:27:57: E261 at least two spaces before inline comment
test/jit/test_misc.py:28:35: E261 at least two spaces before inline comment
```
I did try using [flake8-noqa](https://pypi.org/project/flake8-noqa/) instead of a custom `git grep` lint, but it didn't seem to work. This PR is definitely missing some of the functionality that flake8-noqa is supposed to provide, though, so if someone can figure out how to use it, we should do that instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56272
Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI run (before this PR was finished) failed:
- https://github.com/pytorch/pytorch/runs/2365189927
Reviewed By: janeyx99
Differential Revision: D27830127
Pulled By: samestep
fbshipit-source-id: d6dcf4f945ebd18cd76c46a07f3b408296864fcb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56294
When matching a pattern to `BinaryOpQuantizeHandler`, we need to make
sure we check for dtype support on the base node, instead of the current
node. This is important in cases such as `add-relu` and `mul-relu`,
when the current node is `relu`, but the base node is `add|mul`.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
There is no good test case to check this in current logic. Created an
add-relu model manually, and verified with pdb that the add node was
being used to match against dtypes.
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27831070
fbshipit-source-id: 3697f1328dff9fec3eb910bae49a73793ef36d63
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54813
Previously we had a cat that takes a list of Tensors with different qparams, dequantizes them,
concatenates them, and requantizes with the output qparams. This adds some unnecessary overhead
in dequantizing and quantizing Tensors.
This PR adds an optimization for the cat operator: we make sure the inputs and output of cat
use the same observer/fake_quant and produce a cat that does not do rescaling.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27408377
fbshipit-source-id: 6a4bdcfd15e57ea1fe0f7e72d1e1288eb3ece4db
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56205
Allows for int8 modules to shadow int8 modules. This is useful when
comparing quantized models with different qconfigs.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_int8_shadows_int8
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27807405
fbshipit-source-id: 10c3bc7ab9bb1e6808aa1af23a34c7cf380465fd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56195
This is outdated, removing (forgot to clean up in a previous PR).
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27805334
fbshipit-source-id: 3b035945b4928a3c727e96e0f7fe0efe201f42c0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56194
Enables the NS graph matcher to also match `call_method` nodes.
These are useful for ops such as `torch.sigmoid`.
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher.test_methods
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27805333
fbshipit-source-id: 509ae283db6b245671f11e3eb6b7fcb3a5735ef5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55918
Adds coverage for determining I/O dtype for various ops. This will
enable shadowing of these ops.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_op_io_dtype_coverage
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27740661
fbshipit-source-id: c5ce873ec56bffa50ca46d2fe134c70ed677e37e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55859
Adds mappings for ops which can accept either fp32 or int8 input,
such as `F.relu`. A future PR will fill out the op coverage.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_op_with_either_fp32_or_int8_input
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D27740659
fbshipit-source-id: cfc3dd58319b7161ca7f1fe05cd22d9a3ff11141
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55858
Moves the mappings of input and output dtypes of various ops
into its own file, and makes the variable names more clear. No logic
change.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D27740662
fbshipit-source-id: d384e7e542d9cc868d9cee9c53c2ac2f74a15a48
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55837
Adds a test that checks that all of the relevant op pairs defined in
`quantization_mappings.py` are also defined as related by Numerical
Suite.
Note: this does not cover all the ops, just the ones in
`quantization_mappings.py`. A future PR will fill out the remainder.
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher.test_op_relationship_mapping
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27719979
fbshipit-source-id: 9e852ef94da5f7a653ea15ba52c68a89c8e30208
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55803
Makes the NS `graph_matcher.get_reversed_fusions` use the fusions
defined in the FX quantization code instead of duplicating them.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27719980
fbshipit-source-id: 12e3183405181bb9001f10e765cfb4d2ffdfdd88
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55506
Makes the NS weight extraction tests also test QAT, and fixes
the mappings where necessary to cover all the fusions and make
the tests pass.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_extract_weights_mod_ptq
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_extract_weights_mod_qat
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27650409
fbshipit-source-id: c5bd9268d1bc559afc27d4c5109effd77bf1538a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55505
This is necessary to add support in NS for QAT modules, to avoid
duplicating logic between NSTracer and QuantizationTracer.
The eng work to expose the custom module and class names to
the user will be in a future PR.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27650407
fbshipit-source-id: 431f47c5353b41c11371c5efa79657bfd085459a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55434
Before this PR, there was some hacky logic which determined
the input and output types of nodes based on heuristics such
as inspecting `__module__`, or assuming that an op has an
I/O dtype of `torch.float` when the heuristics did not find
any matches. This is problematic because the heuristics were not exact,
and this could result in non-sensical shadow graphs when the heuristics
would return an incorrect dtype.
This PR switches the dtype determination to an allowlist system,
where we specify exactly what the dtypes are for the nodes or modules
which are in an allowlist, and we add an `UNKNOWN` type for everything
else. The shadow logic is changed to skip inserting shadows on any
function or module where the I/O dtype is unknown.
The current allowlist only contains functions necessary for the
currently existing tests. Filling out the allowlist with all necessary
torch functions is left for a future PR.
As a result of this, we can do the following (also implemented in this PR):
1. enable graph matching on nodes with equal types (for example,
F.linear and F.linear). The restriction against matching nodes with equal types
was in the code as a placeholder; it's better to allow comparisons of
nodes of equal types. One case where this is useful is unshadowed
activations.
2. enable models with user defined modules to be passed to Numeric Suite
APIs without errors.
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher
python test/test_quantization.py TestFXGraphMatcherModels
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27622418
fbshipit-source-id: 40dcba0222c01154c141467640c1eb89725f33a7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55431
Fixes a bug in the test cases where returning early resulted
in some tests not being run. Adds logic for `nni.LinearReLU`,
which was unmasked by making the tests run.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_extract_weights_mod
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D27622415
fbshipit-source-id: 79d9e3125e5d881d9d13645abbe4bd007a5e1d44
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55311
Before this PR, `F.conv1d` was matched by FX graph mode quant patterns
but the prepacking was happening inline. There was also a bug with
argument type mismatch.
This PR fixes both issues and adds a test. Thanks jerryzh168 for the
code tip.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_functional_not_reference
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27575422
fbshipit-source-id: 42301e23cb101a9e64e46800813bc771317e233e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55287
Adds support for extracting weights from F.conv2d and F.conv3d.
F.conv1d and the fused variants are saved for future PRs.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_extract_weights_conv_fun
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D27575424
fbshipit-source-id: e945912d7d0ab320f47cab30d00d60ddb7497158
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55154
Adds functionality to NS to allow matching nodes which have the
same signature across dtypes. For now, only the skeleton is added,
we can fill out the rest of the ops later. This is to unblock
the work to change `cat` to have the same signature for fp32 and int8,
and keep the testing we have for `cat` in NS.
For context, the main reason we are not matching nodes with equal types,
for now, is user defined types for which we do not know the signature.
For now, the design is strictly allowlist of everything. In the future,
we may adjust the design to safely match user defined types.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_ops_with_same_fp32_and_int8_signature
python test/test_quantization.py TestFXGraphMatcher.test_nodes_with_equal_types_do_not_get_matched
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D27504624
fbshipit-source-id: 4f8eb4f3258caf6f99aa373ca7ba516ebbcf4779
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55727
The number of dequantize ops for the fp16 reference pattern was incorrect before; this
PR fixes the problem.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27713390
fbshipit-source-id: 72b8d4cda0bdcea74abe27a76f918d1b47819b01
Summary:
Generally wildcard imports are bad for the reasons described here: https://www.flake8rules.com/rules/F403.html
This PR replaces wildcard imports with an explicit list of imported items where possible, and adds a `# noqa: F403` comment in the other cases (mostly re-exports in `__init__.py` files).
This is a prerequisite for https://github.com/pytorch/pytorch/issues/55816, because currently [`tools/codegen/dest/register_dispatch_key.py` simply fails if you sort its imports](https://github.com/pytorch/pytorch/actions/runs/742505908).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55838
Test Plan: CI. You can also run `flake8` locally.
Reviewed By: jbschlosser
Differential Revision: D27724232
Pulled By: samestep
fbshipit-source-id: 269fb09cb4168f8a51fd65bfaacc6cda7fb87c34
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55710
In the current code, there is an edge case which leads to an error
after the prepare step:
1. have a pattern like this:
```
user_func_unmatched_to_qhandler -> node_matched_to_copy_node_qhandler
```
2. the user function returns a type which is not observable (i.e. not a
Tensor)
3. if this is run through `prepare_fx`, calibrating it with data leads
to a runtime error, because observers cannot observe non-tensor types.
This PR fixes the issue. If a node matched to `CopyNodeQuantizeHandler`
is after an unmatched node, we delete the observer.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_no_obs_between_unmatched_node_and_copy_node
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27686811
fbshipit-source-id: 320be41b1f383c6352ff89fb39a9f480822a3bb2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55529
x.shape outputs a non-Tensor; add this to the all_node_args_have_no_tensors function
to avoid inserting an observer for the getattr "shape" node.
Test Plan: Imported from OSS
Reviewed By: wat3rBro
Differential Revision: D27628145
fbshipit-source-id: 4729294ab80c0a1e72440396d31e7e82257b1092
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55429
Previously we special-cased the copy operator in the normal insert-observer code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27609972
fbshipit-source-id: 378f6aa70f18c0b477b62b6efe236648748aae7e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55388
Temporarily reverts D27314678 (c57541ce06); it appears to cause a perf regression that makes quantization of some models take too long to complete tests.
Reviewed By: houseroad
Differential Revision: D27583809
fbshipit-source-id: e9c088ccbfd3bfb3a1d4c7eafee3eca29ee7717b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55078
Fixes a TODO, make sure we iterate through kwargs as well as args
when navigating graphs. We can use `node.all_input_nodes` convenience
property to accomplish this.
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27474699
fbshipit-source-id: 8a6e3db5a73328c4f296ac5fce951e81213b6f58
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55077
Deletes debugging prints from the code, no logic change.
Test Plan:
CI
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27474700
fbshipit-source-id: 3d9d73da6615ddffdfdb0df270bcdfd2c4b50be3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55060
Removes the previous iteration of Numeric Suite for FX graph mode
quantization, and moves the current iteration into the top level
file.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXGraphMatcher
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27467725
fbshipit-source-id: 4c22b5a3221857231f9f59cf6d2908820e6a7f12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54335
Simple fix to enable weight extraction for nni.ConvReLU2d.
Note: this module only appears if the internal GraphModule APIs are
called, so we add testing for this path.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_extract_weights_mod
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D27192844
fbshipit-source-id: 923cf63e29e4638fd77ca42e69aedb15fb20a330
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54326
Fixes unshadowed activation input logging for subgraphs where start_node does
not equal end_node. In detail:
* instead of passing around a single list of nodes, pass around a list
of nodes to instrument inputs, and a list of nodes to instrument
outputs. This way we can handle multi-node subgraphs properly, and we
also keep the subgraph instance definition out of the public APIs.
* add a test case
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_linear_fp16_activations
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D27190138
fbshipit-source-id: 58e2377c1c128baaf3b760c1ad29098fb21f53d3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54275
Adds support for NS shadow activations path for the fp16 emulation
pattern such as
```
... -> dequantize -> linear -> relu -> to(torch.float16) -> ...
```
There are a couple of changes necessary here:
1. removing the restriction on the shadowing graph pass that the B
subgraph is a single node (since this subgraph is four nodes), and
modifying the code to correctly add the relevant inputs versus output
loggers (input loggers and subgraph copy if we are at start_node,
and output logger if we are at end_node)
2. modifying the logic for calculating node input and output types
to work correctly for the `to` and `dequantize` nodes:
2a. make the function return the first input and output, instead of just
the first input
2b. make the function handle `dequantize` correctly by recursively
using the output of its input
2c. make the function handle `to` correctly by recursively using the
output of its input and the target dtype
3. a bug fix to handle observers in kwargs, while copying subgraphs
Note: input logging for these patterns is not tested yet,
this will be in the next PR.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_linear_fp16
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27172655
fbshipit-source-id: 3bdc86618b2a5782627fcf303d58af7f47fbc30d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54644
Previously we special-cased the copy operator in the normal insert-observer code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27314678
fbshipit-source-id: d36870ceb3717bc01eaeaa6f3f1532ad562cbaf1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53586
Previously one value could only be quantized to one dtype; this PR adds support for quantizing one value
in the fx graph with multiple dtypes, e.g. first quantizing to int8 and then to float16.
We might do some followup PRs to clean up the hacks and refactor the code.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_multiple_qconfigs_single_value
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26912676
fbshipit-source-id: ae3653fd67f05870a3a9e808f491871826c555d5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54860
Currently we insert a quantize_per_tensor op when we encounter a quantizable input,
so if it has multiple uses and not all of them are quantizable, we need to add a dequantize op
before those ops.
In this pass, for a sequence of quantize_per_tensor - dequantize, we combine the two,
since the pair is a no-op.
[internal only][pyper]
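A conceptual sketch of the folding as an FX pass (written for illustration; not the actual implementation):
```
import torch
from torch.fx import GraphModule, Node

def fold_quant_dequant(gm: GraphModule) -> GraphModule:
    # dequantize(quantize_per_tensor(x, ...)) is treated as a no-op pair,
    # so users of the dequantize can consume the original fp32 input.
    for node in list(gm.graph.nodes):
        if node.op == "call_method" and node.target == "dequantize":
            prev = node.args[0]
            if (isinstance(prev, Node) and prev.op == "call_function"
                    and prev.target == torch.quantize_per_tensor):
                node.replace_all_uses_with(prev.args[0])
                gm.graph.erase_node(node)
                if len(prev.users) == 0:  # quantize had no other consumers
                    gm.graph.erase_node(prev)
    gm.recompile()
    return gm
```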
Before this change we had redundant dequantize nodes in the graph
Example 1x inline_cvr graph (98143776f5): https://www.internalfb.com/intern/everpaste/?handle=GODBxAlUMzGHD6MSACpHKKu9qjorbsIXAAAz
FC layers -> 37
quantize_per_tensor -> 30
dequantize -> 49
After this change
https://www.internalfb.com/intern/everpaste/?handle=GAl0uQnOlDNmpLoSAB-GZqRxu9wMbsIXAAAz
FC layers -> 37
quantize_per_tensor -> 30
dequantize -> 39
We remove extra 10 dequantize nodes in the graph.
Test Plan:
python test/test_quantization.py test_fold_quant_dequant
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27390506
fbshipit-source-id: 56e6fb8496171246eccf4bd45eb8bebd87fcb740
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54859
This is applicable to the case when a call_function linear op is one of the users of a quantize op.
In order to be able to map the qparams of quantize_per_tensor to the qparams of the linear operator
that consumes it, we need to use the FQN of the module with the linear op for the qparams of quantize_per_tensor.
Test Plan:
python test/test_quantization.py test_qparams_fqn
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27390505
fbshipit-source-id: a47af0e5ac016f2b2df74fbdf45afe99dc04be46
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54624
Previously we were creating getattr nodes for dtype and axis.
The FX convention is that primitive types are embedded as literals in args/kwargs.
With this change we won't see getattr nodes in the graph anymore for dtype/axis.
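A small runnable illustration of the convention (my example): building a graph where the dtype appears as a literal in the node's args rather than via a getattr node:
```
import torch
import torch.fx as fx

g = fx.Graph()
x = g.placeholder("x")
# FX convention: primitives (scale, zero_point, dtype) are embedded as
# literals in args, not fetched through get_attr nodes.
q = g.call_function(torch.quantize_per_tensor, (x, 0.1, 0, torch.quint8))
g.output(q)
print(g)  # torch.quint8 shows up inline in the call_function node's args
```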
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27306898
fbshipit-source-id: a7c91c7cb21ee96015c7f8830b38d943ada65358
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54640
If we are running constant propagation on a graph that doesn't have any operators with constant inputs and any mutable inputs/outputs, we do not need to initialize an alias db. This is going to be used to speed up symbolic shape analysis.
Test Plan: Imported from OSS
Reviewed By: nikithamalgifb
Differential Revision: D27340863
Pulled By: eellison
fbshipit-source-id: 087b2a33b42c58fa5dae405d652b056d0f1d72e7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54257
Makes the NS weight extraction function work correctly with
fp16 emulation patterns for linear. We navigate to the
weight correctly, and cast it to `torch.float16` before returning.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_linear_fp16
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27159370
fbshipit-source-id: 95f555298e3153e4783c64b3d8c83b9d3fdffa12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54254
In fp16 emulation, we now have patterns such as
```
... -> dequantize -> linear -> relu -> to(torch.float16) -> ...
```
This PR adds support for
* specifying a subgraph's "base_op_node", which is the node containing the op
that should be matched to related nodes. In the example above,
"base_op_node" would be the linear node, and it would be the second
node in the matched pattern.
* matching these fusion patterns and properly setting "base_op_node"
based on pattern and index
* using "base_op_node" instead of "start_node" throughout the NS
codebase wherever the intent is to match subgraphs or create names
for subgraphs.
At the end of this PR, matching unshadowed activations with an example
fp16 emulation pattern works e2e.
I'm saving the following work for future PRs (soon), mostly to keep
PR size manageable:
* adding weight matching (will require some changes to function which
extracts weights)
* adding shadowed activation matching (will require some changes to
shadow copying)
* adding input logging for these patterns (will likely require some changes as well)
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_linear_fp16
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27158199
fbshipit-source-id: 49fc445395452fda62e3c7a243544190f9af691c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54253
Creates an `NSSubgraph` type for representing a subgraph instance,
and modifies the NS code to use it. This will enable us to add
more information to the subgraph instance definition without
having to change all the callsites.
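A plausible shape for the type, hedged from the description above and the "base_op_node" field introduced earlier in this stack (field names are assumptions):
```
from typing import NamedTuple
from torch.fx import Node

class NSSubgraph(NamedTuple):
    start_node: Node    # first node of the matched subgraph
    end_node: Node      # last node of the matched subgraph
    base_op_node: Node  # node containing the op used for matching/naming
```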
Test Plan:
```
mypy torch/quantization
python test/test_quantization.py TestFXGraphMatcher
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27158198
fbshipit-source-id: 548785dd90144e2da256c23af990620c778e7cfe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53828
Moves LSTM shadow activations test to new API. In order
to enable this, adds support for passing two args instead
of one arg when copying a subgraph from A to B.
Since this was the last test of the old API, deletes
the old test case.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_compare_shadow_activations_lstm_dynamic
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26982733
fbshipit-source-id: 03f580688dd37f3ccd688d9f444e9e79cfa84734
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53819
Moves the linear tests for shadow activations to new API.
In order to do so, adds logic for fp32 to fp32 dtype cast,
which is an identity.
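A hedged sketch of the added logic (function name assumed): an fp32 -> fp32 "cast" is the identity, so no real conversion is needed in that case.
```
import torch

def dtype_cast(x: torch.Tensor, prev_dtype: torch.dtype,
               new_dtype: torch.dtype) -> torch.Tensor:
    if prev_dtype == torch.float32 and new_dtype == torch.float32:
        return x  # identity: nothing to do
    return x.to(new_dtype)  # simplified; the real pass also handles quantized dtypes
```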
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_compare_shadow_activations_linear
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26982734
fbshipit-source-id: b6203228abf3cdf74ab0638468a6df77658aa662
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53818
Moves testing of conv for shadow activations to new NS API
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_compare_shadow_activations_conv
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26982732
fbshipit-source-id: 9e8709a76363fbcdf84413e5d4a6c8a0889cb97b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53779
Moves the test case for LSTM activation matching to new NS APIs.
This requires adding the ability to log non-Tensor types.
Since we need Loggers to be scriptable and TorchScript does
not support `Union`, we collect statistics in a separate collector
if we have an RNN. Note: this can scale to a small number of
return types, but not to a large one. If that number becomes large in
the future, we will solve it then.
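A hedged, eager-mode sketch of the workaround (names assumed; the scripted version needs explicit typing): tensor stats and RNN tuple stats live in two separately typed lists so no `Union` return type is required.
```
from typing import List, Tuple
import torch

class OutputLogger(torch.nn.Module):
    stats: List[torch.Tensor]
    # (output, (hidden, cell)) tuples from an LSTM
    stats_rnn: List[Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]]

    def __init__(self):
        super().__init__()
        self.stats = []
        self.stats_rnn = []

    def forward(self, x):
        if isinstance(x, torch.Tensor):
            self.stats.append(x.detach())
        elif isinstance(x, tuple):
            self.stats_rnn.append(x)
        return x
```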
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26967110
fbshipit-source-id: afe60b44fdec28a328813b4f342cf4fe04820baa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54654
Fixes a bug where disabling quantization on potential fusion patterns
would lead to errors in the `convert` function. For example:
1. have a model with add-relu
2. disable quantization for the part of the model containing add-relu
3. run prepare and convert, the convert step would fail because
intermediate nodes were missing from `env`.
The fix is to add handling for this edge case. If quantization is
disabled, we manually copy the nodes for multi-node fusion patterns.
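A hedged repro sketch of steps 1-3 (module and qconfig names are illustrative; the qconfig_dict shape follows the FX API of this era):
```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class Sub(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x + x)  # add-relu fusion pattern

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.sub = Sub()
        self.lin = torch.nn.Linear(4, 4)
    def forward(self, x):
        return self.lin(self.sub(x))

qconfig_dict = {
    "": get_default_qconfig("fbgemm"),
    "module_name": [("sub", None)],  # disable quantization for the add-relu
}
m = prepare_fx(M().eval(), qconfig_dict)
m = convert_fx(m)  # previously failed: add-relu intermediates missing from `env`
```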
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_fusion_pattern_unquantized
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27318454
fbshipit-source-id: 27c1fd1cb7c9711a8e8d338200971c428dae8f98
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53772
Moves the test case for extracting LSTM dynamic weights to new NS API.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_compare_weights_lstm_dynamic
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26967104
fbshipit-source-id: 0d17e7735ec361167dcf72bcb373bfc1aad84668
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53765
Moves linear dynamic weight test case to new NS API.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_compare_weights_linear
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26967109
fbshipit-source-id: 2096a88a3005270696d536f2e1bbc87e70c07230
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53764
Moving the linear weight test case to new FX NS APIs.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_compare_weights_linear
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26967111
fbshipit-source-id: f0a90d7863d5d866e391729ec28e0e0dea339900
Summary:
This PR implements the option to log inputs for FX Numeric Suite. The user-facing API looks like
```
def prepare_model_outputs(..., should_log_inputs : bool = False)
def prepare_model_with_stubs(..., should_log_inputs : bool = False)
```
The output data now looks like
```
{
"layer1": {
"node_inputs": {
"model1": [{
"values": ...,
...,
}],
},
"node_outputs": {
...,
}
},
... // other layers
}
```
One key design decision taken here is that an input logger logs the output of previous nodes, instead of logging the input of the current node. This matters for a signature such as `cat([x1, x2, x3])`. We are inserting three input loggers here (for x1, x2, and x3), instead of a single input logger for `[x1, x2, x3]`. This was chosen in order to preserve the structure of the original graph as much as possible and keep flexibility for future optimizations.
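Sketched (hedged illustration, not actual graph output):
```
// input logging for cat([x1, x2, x3]): one logger per produced value
x1 -> input_logger_1 \
x2 -> input_logger_2 --> cat -> ...
x3 -> input_logger_3 /
```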
Test Plan:
TODO: fill out
Imported from OSS
Differential Revision: D26931225
Reviewed By: hx89
Pulled By: vkuzo
fbshipit-source-id: dd692bfb5ddaaf5554f80c25e2f40b21762e4fc3
Summary:
This PR ensures that when we do a dtype cast for a shadow module,
we insert N dtype casts for N nodes, instead of combining N nodes
into a single dtype cast.
An example where this occurs is `cat([x, y], dim=0)`
```
// original graph
[x, y] -> cat_b -> output

// shadow graph with a single dtype cast, before this PR
        dtype_cast -> cat_a_shadow -> output_a_shadow
       /
[x, y] -> cat_b -> output_b

// shadow graph with multiple dtype casts, after this PR
        [dtype_cast_x, dtype_cast_y] -> cat_a_shadow -> output_a_shadow
       /
[x, y] -> cat_b -> output_b
```
The reason things worked before this PR is that `torch.dequantize`
can take either a single tensor or a list of tensors. We are changing
this to make an upcoming addition of input loggers easier.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_prepare_model_with_stubs_multiple_dtype_casts
```
Imported from OSS
Differential Revision: D26931226
Reviewed By: hx89
Pulled By: vkuzo
fbshipit-source-id: e9c7d4c7942e0f59c952094d2e446b1e2c838396
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53075
The input and output types should be `nn.Module`, to hide
the implementation detail that the pass is using FX.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26740548
fbshipit-source-id: d5ed445379355bebdd90d377c95fcd7e671371a3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53614
Ensures that every subclass of `QuantizeHandler` has a clear name. This
prevents ambiguous names like `Cat`, which looks like a module but is
really a quantize handler.
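A hedged illustration of the naming convention (class names assumed):
```
class QuantizeHandler: ...

class Cat(QuantizeHandler): ...                 # before: reads like a module

class CatQuantizeHandler(QuantizeHandler): ...  # after: clearly a handler
```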
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26914784
fbshipit-source-id: 6dca7e27975c09f422f8e36f1d2b709bf3eaaadf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53196
Before this PR, code patterns like this did not work:
```
x = some_quant_layer(x)
x = torch.stack([x, ...])
x = torch.sum(x, ...)
```
The reason this did not work is that `torch.sum` is treated as
"quantized" because of the newly added fp16 support, even though it is
not actually "quantized" for models where fp16 is not used. We may
need to adjust the concept of "quantized vs non-quantized" into a
"dtype" for the longer-term fix.
The current PR is a hacky fix to unblock. We need to clean things
up before this is landable.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_quant_sum
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26783960
fbshipit-source-id: 3be7c3c1eaa2b8fcb99a105e1b0004c9ffd3a1c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53187
Before this diff, if we had code like
```
x = any_quant_layer(...)
x_size0 = x.size(0)
torch._assert(x_size0 == 1, "expected size 1")
```
The convert code would try to insert a dequantize after `x_size0`,
because it was a descendant of a quantized node and it was needed
for a non-quantized operation. Since the actual type of the `size`
function output is an integer, this does not make sense.
For now, this is fixed as a one-off to unblock a customer. In the
future, we may need to think more deeply about all the functions which
can return non-quantized types from quantized tensors and make sure
they are all covered.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_assert_on_size_after_quant_layer
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26780690
fbshipit-source-id: 44cc25c9179d460efb3f110d40b73d854d676af5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53120
Currently there is a pattern which is not handled correctly by
FX graph mode quantization:
```
def forward(self, x):
ndim = x.ndim
# or add, mul, div, etc
x = torch.sub(x, ndim)
return x
```
The reason this does not work is as follows:
1. x.ndim becomes a getattr node
2. the real world type of x.ndim is an integer, but this is not known from the graph (yet)
3. binary ops such as `torch.sub` require quantization of inputs
4. the framework inserts an observer to observe the output of `ndim`
5. the observer fails because `ndim` is not a Tensor
For now, we hack in a bandaid to unblock some teams; none of this is
meant to land as-is. We will have to think of a better, landable fix (TBD).
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_getattr_with_nontensor_result
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26756180
fbshipit-source-id: c0e498766b22c23df74fbb5aaeaa237c4c944263
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53585
Previously fp16_static CopyNode would be marked as unquantized because of
an incorrect condition check of whether a Node is statically quantized or not.
This PR fixes that.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26912677
fbshipit-source-id: 4ddb538714c5ba2db28430de5e1cf2931baf1993
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53330
Fixed a condition check for fixed-qparam ops; previously we were incorrectly including CopyNodes as well.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_fixed_qparams_ops_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26836867
fbshipit-source-id: 8c486155244f852e675a938c3f4237f26505671c
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50002
The last commit adds tests for 3d conv with the `SubModelFusion` and `SubModelWithoutFusion` classes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50003
Reviewed By: mrshenli
Differential Revision: D26325953
Pulled By: jerryzh168
fbshipit-source-id: 7406dd2721c0c4df477044d1b54a6c5e128a9034
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53166
Context: For FX modules that contain scriptmodules, calling
delattr(module, 'qconfig') throws an AttributeError. We will follow up
with a separate issue/repro to fix this problem.
This PR adds a temporary flag to the convert_fx API to preserve the qconfig attributes on the converted model.
We will remove this flag once we reach a conclusion on calling delattr on scriptmodules.
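A hedged usage sketch; the flag name `_remove_qconfig` is an assumption about the temporary API described above.
```
from torch.quantization.quantize_fx import convert_fx

# `prepared` is the output of prepare_fx; passing the flag keeps the
# qconfig attributes on the converted model
quantized = convert_fx(prepared, _remove_qconfig=False)
```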
Test Plan:
python test/test_quantization.py test_preserve_qconfig
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26771518
fbshipit-source-id: 9fd72816576856ffb4aa11f8fde08303d1df10a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52928
Changes the user facing API of `prepare_single_model_output` to
require a list of nodes instead of a list of subgraphs. This ensures
that how we define a subgraph is an implementation detail and is
not exposed to the user, keeping the eng cost of updating this
implementation later low.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26693471
fbshipit-source-id: 67c2feb844556225e36f8d6d4023246939bcb445
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52927
Refactor to use an existing util instead of duplicating code, no logic
change.
Test Plan:
CI
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26693474
fbshipit-source-id: 06b7047eb9a762557b7f679347e424c0dd009aad
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52926
The model name is already stored in the Loggers during the prepare call.
This removes the need to specify it again in the extract-activations
functions, to simplify things.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26693473
fbshipit-source-id: 52511cacc16f79fa09c78ccde78e7f439f4b315c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52925
Cleans up some incorrect comments and docblocks in
`numeric_suite_core_apis.py`.
Test Plan:
CI
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26693472
fbshipit-source-id: 17f3ff464c6ea01374bcc6ac5899da7034627152
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52651
Merging them for easier extensions to fp16 and more binary ops
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26600118
fbshipit-source-id: a1816e593cf3065afe87d2e6e44cdace13bf6aeb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52534
Currently linear_dynamic_fp16 has a signature that's tied to fbgemm/qnnpack.
We'll need to produce a pattern equivalent to linear_dynamic_fp16 to support extensions
to other backends.
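A hedged sketch of what a backend-neutral equivalent could look like (function name assumed): store the weight in fp16 and cast back to fp32 at runtime for an ordinary fp32 linear.
```
import torch
import torch.nn.functional as F

def linear_dynamic_fp16_ref(x: torch.Tensor, w_fp16: torch.Tensor,
                            b: torch.Tensor) -> torch.Tensor:
    w = w_fp16.to(torch.float32)  # emulate fp16 storage with fp32 compute
    return F.linear(x, w, b)
```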
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_dynamic_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26557726
fbshipit-source-id: 270c9f781f73c79416a092b7831294cabca84b0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52799
We agreed that it's better not to add this, so this PR removes it.
We can make Eager mode NS match this in a future PR.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26652638
fbshipit-source-id: 5baa51a6bf6de5632946417fe9fd3d0f3e78f7fa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52798
Adds the node name and node target type to Numeric Suite outputs.
This is useful for debugging which node got matched to which node,
and what the type of the operation is.
```
// before
{
layer_name: {
model_name: {
'type': 'weight',
'values': [...],
},
},
}
// after
{
layer_name: {
model_name: {
'type': 'weight',
'values': [...],
'node_name': '0',
'node_target_type': "<class 'torch.nn.modules.conv.Conv2d'>",
},
},
}
```
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26652637
fbshipit-source-id: ba75b110cb91234f17a926ccbc5d0ccee2c3faeb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52779
1. makes the return type of the weight comparison APIs match the return
type of the activation comparison APIs:
```
# before
{layer_name: {model_name: weight_tensor}}
{layer_name: {model_name: [activation_tensor]}}
# after
{layer_name: {model_name: [weight_tensor]}}
{layer_name: {model_name: [activation_tensor]}}
```
2. makes a type alias for the type, so future changes are easier (a hedged sketch of such an alias is shown below)
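A hedged sketch of such an alias (the exact name in the codebase is assumed):
```
from typing import Dict, List
import torch

# {layer_name: {model_name: [weight_or_activation_tensor, ...]}}
NSComparisonType = Dict[str, Dict[str, List[torch.Tensor]]]
```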
Test Plan:
```
mypy torch/quantization
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26652639
fbshipit-source-id: eb1f04d6913cedf88d628f362468875ae9ced928