Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70246
Breaks up the large `TestQuantizeDBR` test case into
1. `TestQuantizeDBRIndividualOps` for testing functionality of ops
2. `TestQuantizeDBRMultipleOps` for testing non-fusion interactions between ops
3. `TestQuantizeDBR` for everything else
We may need to refactor this further, but it should unblock things for now.
Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
python test/test_quantization.py TestQuantizeDBRIndividualOps
python test/test_quantization.py TestQuantizeDBRMultipleOps
```
Reviewed By: jerryzh168
Differential Revision: D33255925
Pulled By: vkuzo
fbshipit-source-id: 82db1a644867e9303453cfedffed2d81d083c9cd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69880
Makes the test cases more standardized. In general, we would like to have
```
TestQuantizeEager,
TestQuantizeEagerOps,
TestQuantizeEagerModels,
```
but currently, since we have separate PTQ static, PTQ dynamic, and QAT static APIs, we have only partially cleaned
up the test cases. We can merge all of them later, when we merge all the APIs.
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: supriyar
Differential Revision: D33081418
fbshipit-source-id: fcb96559b76bbc51eb1b0625e0d4b193dbb37532
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68228
Forking this for now so that we can make changes as needed; the changes can be merged back into torch.fx
later
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D32537713
fbshipit-source-id: 326598d13645fcc28ef2c66baaac6a077b80fd0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68602
This PR adds support for configuring the weight/bias dtype in backend_config_dict,
and refactors the current code that checks when to insert observers
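A minimal sketch of what a backend_config_dict entry with dtype configuration might look like (the key names here are illustrative approximations of the schema being defined around this time, not authoritative):
```
import torch

# illustrative sketch only; the exact schema evolved after this PR
backend_config_dict = {
    "configs": [
        {
            "pattern": torch.nn.Linear,
            "dtype_configs": [
                {
                    "input_dtype": torch.quint8,
                    "output_dtype": torch.quint8,
                    "weight_dtype": torch.qint8,
                    "bias_dtype": torch.float,
                },
            ],
        },
    ],
}
```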
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D32537712
fbshipit-source-id: 28eb7c61a8dcad8c1f3f6622d490a34cff0c59e2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68176
It should be noted that for these modules, reduce_range is set to
true by default, in a similar fashion to linear_dynamic.
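For reference, reduce_range halves the effective quantized range (e.g. [0, 127] for quint8) to avoid potential overflow in some x86 int8 kernels. A generic observer-level sketch of the flag (not this PR's module code):
```
import torch

# generic illustration of the reduce_range flag; not code from this PR
obs = torch.quantization.MinMaxObserver(dtype=torch.quint8, reduce_range=True)
obs(torch.randn(16))
scale, zero_point = obs.calculate_qparams()
# with reduce_range=True the observer targets the halved range [0, 127]
```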
Test Plan:
python test/test_quantization.py TestDynamicQuantizedModule
python test/test_quantization.py TestDynamicQuantizedConv
python test/test_quantization.py TestQuantizedConv
Imported from OSS
Reviewed By: kimishpatel
Differential Revision: D32374003
fbshipit-source-id: 011562bd0f4d817387d53bb113df2600aa60a7a3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64676
We implement a working eager mode quantization flow which uses
tracing and `__torch_function__` and `torch.nn.Module.__call__` overrides to automate the model modifications needed for quantization. Partial program capture (instead of full program capture) is used, allowing this scheme to target a wide variety of user programs. Control flow over quantizeable ops is not supported, but general control flow is supported.
In particular:
* `auto_trace.py` contains the machinery to override `__torch_function__` and `torch.nn.Module.__call__` and call hooks before and after each quantizeable module or function (see the sketch after this list)
* `quantization_state.py` contains the state needed to use the hooks to implement quantization logic such as adding quants/dequants, observers, etc.
* please see `README.md` for more details
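A minimal, self-contained sketch of the `__torch_function__` interception idea (not the actual `auto_trace.py` implementation; the hook here just prints, where the real code would insert observers and quant/dequant logic):
```
import torch

class TracedTensor(torch.Tensor):
    # every torch op that sees a TracedTensor routes through this hook
    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        print(f"intercepted: {func.__name__}")  # hook point for quantization logic
        return super().__torch_function__(func, types, args, kwargs)

x = torch.randn(2, 2).as_subclass(TracedTensor)
y = x + x  # prints "intercepted: add"
```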
Test Plan:
```
python test/test_quantization.py TestAutoTracing
python test/test_quantization.py TestAutoTracingModels
```
Differential Revision: D31992281
Reviewed By: HDCharles
Pulled By: vkuzo
fbshipit-source-id: 6d40e855f3c96b9a4b637a0e677388a7b92f7967
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66925
The current convert_fx implementation uses "The Interpreter Pattern" from https://pytorch.org/docs/stable/fx.html
There are two things that have changed which make the approach in this PR possible and needed:
1) The original convert implementation was developed during the initial prototype, when fx did not allow mutations; fx now
supports mutations.
2) The original convert needs to handle a lot of fbgemm/qnnpack-specific logic, which is not needed for reference patterns.
Therefore it makes sense for us to write a new convert function just for reference patterns; the implementation
is significantly easier to understand than the original convert implementation.
Current support:
* we should be able to support all non-weighted ops, like relu, add, etc.
Missing:
* linear and conv
* some advanced features, like standalone modules, input_quantized_idxs, etc.
We will add linear and conv support and start defining backend_config_dict based on this version of convert.
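To illustrate point 1), a minimal sketch of direct graph mutation in fx (the `clone` insertion is a stand-in for inserting quant/dequant nodes; this is not the actual convert code):
```
import torch
import torch.fx as fx

def f(x):
    return torch.relu(x + 1)

gm = fx.symbolic_trace(f)
for node in list(gm.graph.nodes):
    if node.op == "call_function" and node.target is torch.relu:
        # insert a stand-in node after every relu and reroute its users
        with gm.graph.inserting_after(node):
            post = gm.graph.call_method("clone", (node,))
        node.replace_all_uses_with(post)
        post.args = (node,)  # restore post's own input, rewritten by the line above
gm.recompile()
assert torch.equal(gm(torch.ones(2)), f(torch.ones(2)))
```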
Test Plan:
python test/test_quantization.py TestQuantizeFxOpsNew
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D31786241
fbshipit-source-id: 2a32156eb6d3c5271cb44906cd863055785fb5d4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65538
Adds a test which verifies that `prepare_fx` and `convert_fx` work
on models created by `torch.package` in the past. In detail:
1. (one time) create a model and save it with torch.package. Also save input,
expected output, and names of quantization related get_attrs added by
our passes.
2. (every time) load the model from (1), and verify that expected output
matches current output, and that get_attr targets did not change.
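A minimal sketch of the save/load round trip (package and resource names here are illustrative; the real test packages a quantization-transformed model and additionally checks the get_attr targets):
```
import io
import torch
from torch.package import PackageExporter, PackageImporter

model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).eval()
x = torch.randn(2, 4)
expected = model(x)

buf = io.BytesIO()
with PackageExporter(buf) as pe:
    pe.extern(["torch", "torch.**"])  # torch is available in the loading env
    pe.intern("**")                   # package everything else
    pe.save_pickle("model", "model.pkl", model)

buf.seek(0)
loaded = PackageImporter(buf).load_pickle("model", "model.pkl")
torch.testing.assert_close(loaded(x), expected)
```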
Test Plan:
```
python test/test_quantization.py TestSerialization.test_linear_relu_package_quantization_transforms
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D31512939
fbshipit-source-id: 718ad5fb66e09b6b31796ebe0dc698186e9a659f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65033
1. Move the file:
```
hg mv caffe2/torch/quantization/fx caffe2/torch/ao/quantization/fx
hg mv caffe2/torch/quantization/quantize_fx.py caffe2/torch/ao/quantization/quantize_fx.py
```
2. Create new files
```
touch caffe2/torch/quantization/quantize_fx.py
touch caffe2/torch/quantization/fx/__init__.py
```
3. Import things in the new files
4. Add tests to test/quantization/ao_migration/test_quantization_fx.py;
this is needed because we have some fx imports in quantize_fx.py and fx/*.py
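Steps 2 and 3 amount to a forwarding shim in the old location; a minimal sketch (the real files re-export the full public surface):
```
# torch/quantization/quantize_fx.py (sketch)
from torch.ao.quantization.quantize_fx import (  # noqa: F401
    prepare_fx,
    prepare_qat_fx,
    convert_fx,
)
```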
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: vkuzo, z-a-f
Differential Revision: D30949749
fbshipit-source-id: 9e5d4d039c8a0a0820bc9040e224f0d2c26886d3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64912
The test naming was confusing and ambiguous. The file was changed to reflect the framework that is being migrated ("quantization" instead of "quantize"). Also, the common testing class was extracted out.
ghstack-source-id: 138157450
Test Plan: `buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization`
Reviewed By: vkuzo
Differential Revision: D30898214
fbshipit-source-id: 017f95995271d35bcdf6ff6a1b3974b837543e84
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64445
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates quantize.py from torch.quantization to torch.ao.quantization.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.
Test Plan: `buck test mode/dev //caffe2/test:quantization`
Reviewed By: HDCharles
Differential Revision: D30734870
fbshipit-source-id: dc204f3cc46bff2cc81c95159eab9d333b43bb4b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64086
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates `quantize.py` from torch.quantization to `torch.ao.quantization`.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.
Test Plan: `buck test mode/opt //caffe2/test:quantization`
Reviewed By: jerryzh168, raghuramank100
Differential Revision: D30055886
fbshipit-source-id: 8ef7470f9fa640c0042bef5bb843e7a05ecd0b9f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61570
Adds a fused operator that computes moving average min/max values (in-place) of the input tensor and fake-quantizes it.
It expects the qmin/qmax values to reflect the range of the quantized tensor (instead of using reduce_range).
The motivation for adding this operator is performance: moving the computation from Python to C++/CUDA can increase the performance of QAT.
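A minimal usage sketch of the module that wraps the fused op (the import path shown is the later torch.ao location; at the time of this PR it lived under torch.quantization, and the constructor mirrors the other FakeQuantize classes):
```
import torch
from torch.ao.quantization.fake_quantize import FusedMovingAvgObsFakeQuantize

fq = FusedMovingAvgObsFakeQuantize(quant_min=0, quant_max=255, dtype=torch.quint8)
x = torch.randn(8, 8)
y = fq(x)  # updates running min/max in-place and fake-quantizes in one fused kernel
```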
Test Plan:
python test/test_quantization.py TestFusedObsFakeQuant
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D29682762
fbshipit-source-id: 28e4c50e77236d6976fe4b326c9a12103ed95840
Summary: Implemented two observers (InputEqualObserver and WeightEqualObserver), which will be inserted into the graph during prepare_fx().
Test Plan: python test/test_quantization.py TestEqualizeFx
Reviewed By: supriyar
Differential Revision: D28836954
fbshipit-source-id: 25517dc82ae67698ed8b2dc334e3323286976104
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59088
Clean up comments and organize the tests better
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28750064
fbshipit-source-id: 4c36922e25e3adea3aaa8b4d9185dc28b17aa57c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59007
Create folders for each test category and move the tests.
Will follow-up with a cleanup of test_quantization.py
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: HDCharles
Differential Revision: D28718742
fbshipit-source-id: 4c2dbbf36db35d289df9708565b7e88e2381ff04
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59000
These tests span both the QAT and PTQ APIs, so factor them out
Test Plan:
python test/test_quantization.py TestModelNumericsEager
Imported from OSS
Reviewed By: HDCharles
Differential Revision: D28713910
fbshipit-source-id: b2ad27cf59abb7cc0c4e4da705f8c9220410f8ad
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58999
Rename the test files to be more explicit that they are for eager mode
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: HDCharles
Differential Revision: D28713909
fbshipit-source-id: b4ccd06c841fe96edf8c065a0bceae15fed260f9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58963
Some tests are used to check the op-level numerics of the fake quantize operations
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: HDCharles
Differential Revision: D28696599
fbshipit-source-id: 98f9b0c993dd43050176125461ddd5288142989b
Summary:
As this diff shows, currently there are a couple hundred instances of raw `noqa` in the codebase, which just ignore all errors on a given line. That isn't great, so this PR changes all existing instances of that antipattern to qualify the `noqa` with respect to a specific error code, and adds a lint to prevent more of this from happening in the future.
Interestingly, some of the examples the `noqa` lint catches are genuine attempts to qualify the `noqa` with a specific error code, such as these two:
```
test/jit/test_misc.py:27: print(f"{hello + ' ' + test}, I'm a {test}") # noqa E999
test/jit/test_misc.py:28: print(f"format blank") # noqa F541
```
However, those are still wrong because they are [missing a colon](https://flake8.pycqa.org/en/3.9.1/user/violations.html#in-line-ignoring-errors), which actually causes the error code to be completely ignored:
- If you change the error codes to anything else, the warnings will still be suppressed.
- If you add the necessary colons then it is revealed that `E261` was also being suppressed, unintentionally:
```
test/jit/test_misc.py:27:57: E261 at least two spaces before inline comment
test/jit/test_misc.py:28:35: E261 at least two spaces before inline comment
```
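For reference, the qualified form that flake8 actually honors adds the colon (and at least two spaces before the inline comment):
```
print(f"{hello + ' ' + test}, I'm a {test}")  # noqa: E999
print(f"format blank")  # noqa: F541
```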
I did try using [flake8-noqa](https://pypi.org/project/flake8-noqa/) instead of a custom `git grep` lint, but it didn't seem to work. This PR is definitely missing some of the functionality that flake8-noqa is supposed to provide, though, so if someone can figure out how to use it, we should do that instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56272
Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI run (before this PR was finished) failed:
- https://github.com/pytorch/pytorch/runs/2365189927
Reviewed By: janeyx99
Differential Revision: D27830127
Pulled By: samestep
fbshipit-source-id: d6dcf4f945ebd18cd76c46a07f3b408296864fcb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53828
Moves the LSTM shadow activations test to the new API. In order
to enable this, adds support for passing two args instead
of one when copying a subgraph from A to B.
Since this was the last test of the old API, deletes
the old test case.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_compare_shadow_activations_lstm_dynamic
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26982733
fbshipit-source-id: 03f580688dd37f3ccd688d9f444e9e79cfa84734
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52092
Adds a very simple toy sparsenn model, and enables
its inspection with the new NS APIs.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_sparsenn_compare_activations
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_sparsenn_shadow
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26403095
fbshipit-source-id: 3c3650aca47186deb32f2b3f1d87a0716d1ad9d1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52302
Adds the basic functionality for the three Numeric Suite core APIs to work on FX models:
1. comparing weights
2. comparing activations, with same input fed to both models
3. comparing activations, with nodes of A shadowing nodes of B
Note: there are a lot of TODOs in the code, and some/most of the APIs and implementation details may change as we iterate. This is just the first PR.
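A minimal sketch of core API (1), weight comparison, as exercised in the tests (the module path reflects where this code lands under torch.ao, and the exact signature is an assumption based on later versions):
```
import copy
import torch
import torch.ao.ns._numeric_suite_fx as ns

m_a = torch.nn.Sequential(torch.nn.Linear(4, 4)).eval()
m_b = copy.deepcopy(m_a)
# returns a dict mapping each matched layer to the weights from both models
results = ns.extract_weights("model_a", m_a, "model_b", m_b)
```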
Test Plan:
We have unit test coverage for all of the APIs, for now this is with toy models:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Reviewed By: raghuramank100
Differential Revision: D26463013
Pulled By: vkuzo
fbshipit-source-id: e454115099ad18e4037d3c54986951cdffcab367
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51669
Adds the basic functionality for the three Numeric Suite core APIs to work on FX models:
1. comparing weights
2. comparing activations, with same input fed to both models
3. comparing activations, with nodes of A shadowing nodes of B
Note: there are a lot of TODOs in the code, and some/most of the APIs and implementation details may change as we iterate. This is just the first PR.
Test Plan:
We have unit test coverage for all of the APIs, for now this is with toy models:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26403094
fbshipit-source-id: 9752331d4ae0105346d3da309b13c895b593b450
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51588
Early version of a utility to match nodes between graph A and graph B, for Numeric Suite for FX graph mode quantization.
The main goal of this utility is to reliably match the nodes of graph A to the nodes of graph B, and throw an easy-to-read error message. This will be used in future PRs to create the APIs for matching activations. It could also potentially be used to match weights.
Test Plan:
For now, we have bare bones test coverage on some toy models, and a single torchvision model.
```
python test/test_quantization.py TestFXGraphMatcher
```
Future PRs will add more testing.
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26403093
fbshipit-source-id: 60e318d51e6fefe65265488c4967629d946048ef
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49671
- Introduces the `torch.nn.quantizable` namespace
- Adds the `torch.nn.quantizable.LSTM` module
The point of the `quantizable` namespace is to separate the purely quantized modules from the modules that could be quantized through a normal quantization flow, but are not using the quantized kernels explicitly.
That means the quantizable modules are functionally and numerically equivalent to the FP ones and can be used instead of the FP ones without any loss.
The main difference between the `torch.nn.LSTM` and the `torch.nn.quantizable.LSTM` is that the former one does not support observation for the linear layers, because all the computation is internal to the `aten` namespace.
The `torch.nn.quantizable.LSTM`, however, uses explicit linear layers that can be observed for further quantization.
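A minimal usage sketch (the module lives under `torch.nn.quantizable` as of this PR; it later moved to `torch.ao.nn.quantizable`):
```
import torch

lstm = torch.nn.quantizable.LSTM(input_size=4, hidden_size=8, num_layers=1)
x = torch.randn(5, 2, 4)  # (seq_len, batch, input_size)
out, (h, c) = lstm(x)     # same calling contract as torch.nn.LSTM
```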
Test Plan: Imported from OSS
Differential Revision: D25663870
Reviewed By: vkuzo
Pulled By: z-a-f
fbshipit-source-id: 70ff5463bd759b9a7922571a5712d3409dfdfa06
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47085
Both in train and eval mode
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24632457
fbshipit-source-id: 486aee4e073fb87e9da46a344e8dc77e848a60cf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46657
This is used to simulate the fake quantize operation for ops with fixed quantization parameters, e.g. hardsigmoid.
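A minimal sketch using the constructor as it exists around this PR (scale and zero_point passed directly; later versions changed the signature to take an observer, so both the import path and signature here are assumptions):
```
import torch
from torch.quantization import FixedQParamsFakeQuantize

# hardsigmoid/sigmoid outputs live in [0, 1], hence scale=1/256, zero_point=0
fq = FixedQParamsFakeQuantize(scale=1.0 / 256.0, zero_point=0,
                              dtype=torch.quint8, quant_min=0, quant_max=255)
y = fq(torch.rand(4))  # fake-quantizes with the fixed qparams; nothing is observed
```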
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24451406
fbshipit-source-id: 26cc140c00f12bdec9a8f9dc880f4c425f4d4074
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45538
This is used to simulate the fake quantize operation for ops with fixed quantization parameters, e.g. hardsigmoid.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24004795
fbshipit-source-id: fc4797f80842daacd3b3584c5b72035774634edd