Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52092
Adds a very simple toy sparsenn model, and enables
its inspection with the new NS APIs.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_sparsenn_compare_activations
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_sparsenn_shadow
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26403095
fbshipit-source-id: 3c3650aca47186deb32f2b3f1d87a0716d1ad9d1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52302
Adds the basic functionality for the three Numeric Suite core APIs to work on FX models:
1. comparing weights
2. comparing activations, with same input fed to both models
3. comparing activations, with nodes of A shadowing nodes of B
Note: there are a lot of TODOs in the code, and some/most of the APIs and implementation details may change as we iterate. This is just the first PR.
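As a rough illustration of the three comparison modes listed above, here is a minimal sketch on toy scalar "models". It does not use torch or the real Numeric Suite APIs; `make_model`, `run`, `compare_weights`, `compare_activations` and `compare_shadow` are hypothetical names, and the reading of "A shadowing B" (each layer of A is fed the input that the matching layer of B received) is an assumption.

```python
# Toy stand-in: a "model" is an ordered list of (layer_name, weight) pairs,
# and each layer computes weight * x. Hypothetical, not the real NS APIs.

def make_model(weights):
    return [(f"layer{i}", w) for i, w in enumerate(weights)]

def run(model, x, log):
    """Run the model on x, recording each layer's output in log."""
    for name, w in model:
        x = w * x
        log[name] = x
    return x

def compare_weights(model_a, model_b):
    """1. Pair up the weights of corresponding layers."""
    return {na: (wa, wb) for (na, wa), (_, wb) in zip(model_a, model_b)}

def compare_activations(model_a, model_b, x):
    """2. Feed the same input to both models and pair up the activations."""
    log_a, log_b = {}, {}
    run(model_a, x, log_a)
    run(model_b, x, log_b)
    return {name: (log_a[name], log_b[name]) for name in log_a}

def compare_shadow(model_a, model_b, x):
    """3. Each layer of A sees the input of the matching layer of B."""
    results = {}
    for (name, wa), (_, wb) in zip(model_a, model_b):
        out_b = wb * x
        out_a = wa * x      # A's layer is evaluated on B's input
        results[name] = (out_a, out_b)
        x = out_b           # only B's output is propagated forward
    return results

model_a = make_model([0.5, 1.5])
model_b = make_model([0.75, 1.5])
print(compare_weights(model_a, model_b))
print(compare_activations(model_a, model_b, 4.0))
print(compare_shadow(model_a, model_b, 4.0))
```

In this toy setup only layer0 differs between the two models. Mode 2 shows a mismatch at both layers because the error propagates, while mode 3 isolates it to layer0.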
Test Plan:
We have unit test coverage for all of the APIs; for now this uses toy models:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Reviewed By: raghuramank100
Differential Revision: D26463013
Pulled By: vkuzo
fbshipit-source-id: e454115099ad18e4037d3c54986951cdffcab367
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51669
Adds the basic functionality for the three Numeric Suite core APIs to work on FX models:
1. comparing weights
2. comparing activations, with same input fed to both models
3. comparing activations, with nodes of A shadowing nodes of B
Note: there are a lot of TODOs in the code, and some/most of the APIs and implementation details may change as we iterate. This is just the first PR.
Test Plan:
We have unit test coverage for all of the APIs; for now this uses toy models:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26403094
fbshipit-source-id: 9752331d4ae0105346d3da309b13c895b593b450
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51588
Early version of a utility to match nodes between graph A and graph B, for the Numeric Suite for FX graph mode quantization.
The main goal of this utility is to reliably match the nodes of graph A to the nodes of graph B, and to throw an easy-to-read error message when they cannot be matched. This will be used in future PRs to create the APIs for matching activations. It could also potentially be used to match weights.
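To make the intent concrete, here is a minimal sketch of node matching with a readable error. It is not the actual matcher: graphs are reduced to ordered lists of `(node_name, op_type)` pairs, and `GraphMatchingError`, `match_nodes` and `related_ops` are hypothetical names chosen for illustration.

```python
class GraphMatchingError(Exception):
    """Raised when two graphs cannot be matched node by node."""

def match_nodes(graph_a, graph_b, related_ops):
    """Match nodes of graph A to nodes of graph B, in order.

    graph_a, graph_b: lists of (node_name, op_type) in execution order.
    related_ops: set of frozensets of op types treated as equivalent,
                 e.g. a float op and its quantized counterpart.
    """
    if len(graph_a) != len(graph_b):
        raise GraphMatchingError(
            f"graph A has {len(graph_a)} matchable nodes "
            f"but graph B has {len(graph_b)}"
        )
    matches = {}
    for (name_a, op_a), (name_b, op_b) in zip(graph_a, graph_b):
        if op_a != op_b and frozenset((op_a, op_b)) not in related_ops:
            raise GraphMatchingError(
                f"node '{name_a}' ({op_a}) in graph A does not match "
                f"node '{name_b}' ({op_b}) in graph B"
            )
        matches[name_a] = name_b
    return matches

related = {frozenset(("conv2d", "quantized_conv2d"))}
graph_a = [("conv1", "conv2d"), ("relu1", "relu")]
graph_b = [("conv1_q", "quantized_conv2d"), ("relu1_q", "relu")]
print(match_nodes(graph_a, graph_b, related))
```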
Test Plan:
For now, we have bare-bones test coverage on some toy models and a single torchvision model.
```
python test/test_quantization.py TestFXGraphMatcher
```
Future PRs will add more testing.
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26403093
fbshipit-source-id: 60e318d51e6fefe65265488c4967629d946048ef
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50748
Adds support for Linear + BatchNorm1d fusion to quantization.
This is a redo of dreiss's https://github.com/pytorch/pytorch/pull/37467; it was faster
to copy-paste it than to rebase and deal with conflicts.
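For reference, the arithmetic behind folding an eval-mode BatchNorm1d into the preceding Linear can be sketched in plain Python. This is the standard folding formula, not the code from this PR; `linear`, `batchnorm_eval` and `fuse_linear_bn` are hypothetical helpers operating on lists.

```python
import math

def linear(x, weight, bias):
    """y = W x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def batchnorm_eval(y, mean, var, gamma, beta, eps=1e-5):
    """Eval-mode batch norm using fixed running statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, m, v, g, bt in zip(y, mean, var, gamma, beta)]

def fuse_linear_bn(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Fold the batch norm into the linear layer's weight and bias."""
    fused_w, fused_b = [], []
    for row, b, m, v, g, bt in zip(weight, bias, mean, var, gamma, beta):
        scale = g / math.sqrt(v + eps)      # per-output-feature scale
        fused_w.append([w * scale for w in row])
        fused_b.append((b - m) * scale + bt)
    return fused_w, fused_b

weight = [[1.0, 2.0], [3.0, 4.0]]
bias = [0.5, -0.5]
mean, var = [0.1, 0.2], [1.5, 0.5]
gamma, beta = [2.0, 0.5], [0.3, -0.1]
x = [1.0, -2.0]

reference = batchnorm_eval(linear(x, weight, bias), mean, var, gamma, beta)
fused_w, fused_b = fuse_linear_bn(weight, bias, mean, var, gamma, beta)
fused = linear(x, fused_w, fused_b)
print(reference, fused)  # the two should agree up to rounding error
```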
Test Plan:
```
python test/test_quantization.py TestFusion.test_fusion_linear_bn_eval
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D25957432
fbshipit-source-id: 24e5b760f70186aa953ef65ab0182770e89495e4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49623
(not ready for review)
Ensures that conv bias is not observed in a `F.conv{n}d` call.
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25652856
fbshipit-source-id: 884f87be1948d3e049a557d79bec3c90aec34340
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49428
Previously, DeQuantStub was swapped with nn.quantized.DeQuantize regardless of qconfig.
The reason is that we skipped attaching a qconfig to DeQuantStub to avoid adding a fake quantize module to it.
The correct fix is to skip it when inserting observers instead; this PR fixes the issue.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25569991
fbshipit-source-id: d44a08c6e64c7a49509687dc389b57de1cbb878c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49420
Before: if an output was marked as quantized, it could actually not
be quantized, if the previous node was not quantized.
After: if an output was marked as quantized, it will be quantized
regardless of the quantization status of the previous node.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_quant_output_always_observed
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25566834
fbshipit-source-id: 84755a1605fd3847edd03a7887ab9f635498c05c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48939
Add numerical test for fx graph mode for resnet base, comparing with eager mode
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D25375342
fbshipit-source-id: 08f49b88daede47d44ee2ea96a02999fea246cb2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48069
Also renamed float_qparam_dynamic_qconfig to float_qparam_weight_only_qconfig.
It's not used in user code yet, so we only need to update the tests.
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D25010175
fbshipit-source-id: caa3eaa5358a8bc5c808bf5f64e6ebff3e0b61e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48038
nn.ReLU works for both float and quantized input, so we don't want to define an nn.quantized.ReLU
that does the same thing as nn.ReLU; the same applies to nn.quantized.functional.relu.
This also removes the numerical inconsistency for models that quantize nn.ReLU independently in QAT mode.
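A small worked example of why one ReLU definition can serve both input kinds: under affine quantization, clamping the integer value at the zero point is the same as applying ReLU to the dequantized value. This is a plain-Python sketch, not the PyTorch kernels; the helper names and the uint8 range are assumptions.

```python
def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Affine-quantize a float to an integer in [qmin, qmax]."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    return scale * (q - zero_point)

def relu_float(x):
    return max(x, 0.0)

def relu_quantized(q, zero_point):
    """ReLU in the integer domain: clamp at the zero point."""
    return max(q, zero_point)

scale, zero_point = 0.5, 128
for x in [-3.0, -0.5, 0.0, 0.5, 7.0]:
    q = quantize(x, scale, zero_point)
    via_int = dequantize(relu_quantized(q, zero_point), scale, zero_point)
    via_float = relu_float(dequantize(q, scale, zero_point))
    print(x, via_int, via_float)  # both paths give the same value
```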
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25000462
fbshipit-source-id: e3609a3ae4a3476a42f61276619033054194a0d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47415
nn.ReLU works for both float and quantized input, so we don't want to define an nn.quantized.ReLU
that does the same thing as nn.ReLU; the same applies to nn.quantized.functional.relu.
This also removes the numerical inconsistency for models that quantize nn.ReLU independently in QAT mode.
Test Plan: Imported from OSS
Reviewed By: z-a-f
Differential Revision: D24747035
fbshipit-source-id: b8fdf13e513a0d5f0c4c6c9835635bdf9fdc2769
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46955
Initially we were thinking of adding an `invalidate_quantized_float_parameters` option to free the memory
of quantized floating point parameters, but it turns out we will do a module swap, just like in eager mode, for the modules
that are quantized, so the old floating point module will not be referenced after quantization. Therefore this feature
is only needed for functionals. Since most people use quantization with modules, we may not need it.
We'll revisit if we find there is a need for it.
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D24579400
fbshipit-source-id: fbb0e567405dc0604a2089fc001573affdade986
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46337
We plan to pass the mappings around instead of using the global registration API, to keep
the mappings local to the transformations the user is performing.
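The design choice can be illustrated with a short sketch: a transformation takes the mapping as an argument, so a user can extend it for one call without mutating shared state. All names below (`DEFAULT_MAPPING`, `swap_modules`, the string "module types") are hypothetical and stand in for the real mappings.

```python
# Hypothetical default mapping from float module types to quantized ones.
DEFAULT_MAPPING = {
    "Linear": "QuantizedLinear",
    "Conv2d": "QuantizedConv2d",
}

def swap_modules(module_types, mapping=None):
    """Return the swapped module types, using the mapping passed in.

    Falls back to DEFAULT_MAPPING when no mapping is given. Types that
    have no entry in the mapping are left unchanged.
    """
    mapping = DEFAULT_MAPPING if mapping is None else mapping
    return [mapping.get(t, t) for t in module_types]

model = ["Linear", "MyOp", "Conv2d"]

# A custom mapping stays local to this one call; nothing is registered globally.
custom_mapping = {**DEFAULT_MAPPING, "MyOp": "QuantizedMyOp"}
print(swap_modules(model))
print(swap_modules(model, custom_mapping))
```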
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24317436
fbshipit-source-id: 81569b88f05eeeaa9595447e482a12827aeb961f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45672
This PR merges all quantization modes and exposes only the following top-level functions:
```
prepare_fx
prepare_qat_fx
convert_fx
```
Test Plan:
Imported from OSS
Reviewed By: z-a-f
Differential Revision: D24053439
fbshipit-source-id: 03d545e26a36bc22a73349061b751eeb35171e64
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45292
This PR merges all quantization modes and exposes only the following top-level functions:
```
prepare_fx
prepare_qat_fx
convert_fx
```
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23913105
fbshipit-source-id: 4e335286d6de225839daf51d1df54322d52d68e5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44846
The save function traverses the model state dict to pick out the observer stats.
The load function traverses the module hierarchy to load the state dict into module attributes, depending on observer type.
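A minimal sketch of the two directions, using plain dicts and toy classes instead of torch modules. The key marker `activation_post_process`, the classes `ToyModule` and `ToyObserver`, and both function names are assumptions made for illustration; the sketch also handles only one observer type, whereas the description above dispatches on observer type.

```python
OBSERVER_MARKER = "activation_post_process"  # assumed naming convention

def get_observer_state(state_dict):
    """Save side: keep only the state dict entries that belong to observers."""
    return {k: v for k, v in state_dict.items() if OBSERVER_MARKER in k}

class ToyModule:
    """Toy stand-in for a module with named children and attributes."""
    def __init__(self, **children):
        self.children = children
        self.attrs = {}

class ToyObserver(ToyModule):
    """Toy stand-in for an observer holding statistics such as min/max."""

def load_observer_state(module, observer_state, prefix=""):
    """Load side: walk the hierarchy and set attributes on each observer."""
    for name, child in module.children.items():
        child_prefix = f"{prefix}{name}."
        if isinstance(child, ToyObserver):
            for key, value in observer_state.items():
                if key.startswith(child_prefix):
                    child.attrs[key[len(child_prefix):]] = value
        load_observer_state(child, observer_state, child_prefix)

state_dict = {
    "fc.weight": [[1.0, 2.0]],
    "fc.activation_post_process.min_val": -1.5,
    "fc.activation_post_process.max_val": 2.5,
}
saved = get_observer_state(state_dict)

model = ToyModule(fc=ToyModule(activation_post_process=ToyObserver()))
load_observer_state(model, saved)
print(model.children["fc"].children["activation_post_process"].attrs)
```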
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_save_observer_state_dict
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D23746821
fbshipit-source-id: 05c571b62949a2833602d736a81924d77e7ade55
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44835
This is for feature parity with fx graph mode quantization
Test Plan: Imported from OSS
Reviewed By: z-a-f
Differential Revision: D23745086
fbshipit-source-id: ae2fc86129f9896d5a9039b73006a4da15821307
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44217
Move the tests to static ones as well
Test Plan:
python test/test_quantization.py TestStaticQuantizedModule.test_embedding_bag_api
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D23547386
fbshipit-source-id: 41f81c31e1613098ecf6a7eff601c7dcd4b09c76
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44208
Add a quantized module in the static quantization namespace. Embedding
quantization requires only weights to be quantized, so it is static.
Internally it calls the embedding_bag_byte op with the offsets set to correspond to the
indices.
A future PR will move EmbeddingBag quantization from dynamic to static as well.
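The trick of expressing an embedding lookup through an embedding-bag op can be sketched without torch: if every bag contains exactly one index, a sum-mode embedding bag returns the plain embedding rows. The functions below are toy float versions written for illustration; they are not the quantized op, and their names are assumptions.

```python
def embedding_bag_sum(weight, indices, offsets):
    """Sum-mode embedding bag: bag i covers indices[offsets[i]:offsets[i+1]]."""
    bags = []
    for i, start in enumerate(offsets):
        end = offsets[i + 1] if i + 1 < len(offsets) else len(indices)
        rows = [weight[j] for j in indices[start:end]]
        # Sum the selected rows column by column.
        bags.append([sum(col) for col in zip(*rows)])
    return bags

def embedding(weight, indices):
    """Plain lookup, expressed as an embedding bag with one index per bag."""
    offsets = list(range(len(indices)))  # offsets correspond to the indices
    return embedding_bag_sum(weight, indices, offsets)

weight = [[0.5, 1.5], [2.0, 3.0], [4.0, 5.0]]
print(embedding(weight, [2, 0, 2]))
print(embedding_bag_sum(weight, [2, 0, 2], [0, 2]))
```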
Test Plan:
python test/test_quantization.py test_embedding_api
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23547384
fbshipit-source-id: eddc6fb144b4a771060e7bab5853656ccb4443f0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44092
Instead, submodules and weights are installed directly on the
graph_module by transferring the original modules. This makes it more
likely that scripting will succeed (since we no longer have submodules
that are not used in the trace). It also prevents layered transforms
from having to special case handling of the `root` module. GraphModules
can now be re-traced as part of the input to other transforms.
Test Plan: Imported from OSS
Reviewed By: jamesr66a
Differential Revision: D23504210
Pulled By: zdevito
fbshipit-source-id: f79e5c4cbfc52eb0ffb5d6ed89b37ce35a7dc467
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43902
Trace back from the weight node until we hit getattr, reconstruct the graph module with the traced nodes,
and run the graph module to pack the weight. Then replace the original chain of ops with the packed weight.
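The folding procedure can be sketched on a toy graph representation. This is not torch.fx: the graph is a dict mapping node names to `(op, target, args)` tuples, and `trace_back`, `run_subgraph`, `fold_weight`, `transpose` and `round_rows` are hypothetical helpers. The sketch assumes the weight chain depends only on attributes and is not used anywhere else in the graph.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def round_rows(m):
    return [[round(v) for v in row] for row in m]

def trace_back(graph, node_name):
    """Collect the nodes feeding node_name, dependencies first, stopping at get_attr."""
    op, _, args = graph[node_name]
    order = []
    if op != "get_attr":
        for arg in args:
            order.extend(trace_back(graph, arg))
    order.append(node_name)
    return order

def run_subgraph(graph, order, attrs):
    """Evaluate the traced nodes and return the value of the last one."""
    env = {}
    for name in order:
        op, target, args = graph[name]
        if op == "get_attr":
            env[name] = attrs[target]
        else:
            env[name] = target(*(env[a] for a in args))
    return env[order[-1]]

def fold_weight(graph, attrs, weight_node, packed_name="packed_weight"):
    """Precompute the weight chain and replace it with a single attribute."""
    chain = trace_back(graph, weight_node)
    attrs[packed_name] = run_subgraph(graph, chain, attrs)
    new_graph = {packed_name: ("get_attr", packed_name, ())}
    for name, (op, target, args) in graph.items():
        if name in chain:
            continue  # the original chain of ops is dropped
        new_args = tuple(packed_name if a == weight_node else a for a in args)
        new_graph[name] = (op, target, new_args)
    return new_graph

attrs = {"weight": [[1.2, 2.7], [3.6, 4.4]]}
graph = {
    "w": ("get_attr", "weight", ()),
    "w_t": ("call_function", transpose, ("w",)),
    "w_q": ("call_function", round_rows, ("w_t",)),
    "x": ("placeholder", "x", ()),
    "out": ("call_function", "linear", ("x", "w_q")),
}
folded = fold_weight(graph, attrs, "w_q")
print(list(folded), attrs["packed_weight"])
```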
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23432431
fbshipit-source-id: 657f21a8287494f7f87687a9d618ca46376d3aa3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43901
Add APIs similar to those of eager mode and TorchScript graph mode:
- fuse_fx
- quantize_fx (for both post training static and qat)
- quantize_dynamic_fx (for post training dynamic)
- prepare_fx (for both post training static and qat)
- prepare_dynamic_fx (for post training dynamic)
- convert_fx (for all modes)
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23432430
fbshipit-source-id: fc99eb75cbecd6ee7a3aa6c8ec71cd499ff7e3c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43728
Trace back from the weight node until we hit getattr, reconstruct the graph module with the traced nodes,
and run the graph module to pack the weight. Then replace the original chain of ops with the packed weight.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23385090
fbshipit-source-id: 11341f0af525a02ecec36f163a9cd35dee3744a1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43581
Add APIs similar to those of eager mode and TorchScript graph mode:
- fuse_fx
- quantize_fx (for both post training static and qat)
- quantize_dynamic_fx (for post training dynamic)
- prepare_fx (for both post training static and qat)
- prepare_dynamic_fx (for post training dynamic)
- convert_fx (for all modes)
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23385091
fbshipit-source-id: b789e54e1a0f3af6b026fd568281984e253e0433
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43587
Add tests for graph mode quantization on torchvision and make sure it matches
current eager mode quantization
Test Plan:
Imported from OSS
Reviewed By: z-a-f
Differential Revision: D23331253
fbshipit-source-id: 0445a44145d99837a2c975684cd0a0b7d965c8f9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43526
Add tests for graph mode quantization on torchvision and make sure it matches
current eager mode quantization
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23306683
fbshipit-source-id: 30d27e225d4557bfc1d9aa462086e416aa9a9c0e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43445
Changed the interface of checkGraphModule to make the arguments more explicit,
as requested in https://github.com/pytorch/pytorch/pull/43437
Test Plan:
TestQuantizeFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23280586
fbshipit-source-id: 5b5859e326d149a5aacb1d15cbeee69667cc9109