Summary: A cast to int was added in
https://github.com/pytorch/pytorch/pull/45630 to silence a mypy
complaint. However, this leads to unexpected behavior: the histogram
doesn't actually capture the full range of activation values.
note1: the test_histogram_observer_against_reference test was silently
broken on master. The random parameters it normally runs with don't
happen to cause a failure, but running the test repeatedly in a loop
would eventually make it fail. This was because, in some cases,
sum(<tensor>) != torch.sum(<tensor>).item(). I was not able to reproduce
this with a toy example, but running this test in a loop and editing
either observer to print the calculation for 'total' would break the
test and show different behaviors. Fixing this test was necessary to
land this PR since the changed histogram bounds altered things enough
that this test would error.
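The mismatch is plausibly a floating-point accumulation-order effect. A minimal standalone illustration (pure Python, not the actual observer code, and not a reproduction of the exact failure) of how two reduction orders can disagree:

```python
# Floating-point addition is not associative, so two reductions that
# accumulate in different orders can give different results; this is the
# kind of effect that can make sum(t) != torch.sum(t).item().
vals = [1e16, 1.0, -1e16]
left_to_right = (vals[0] + vals[1]) + vals[2]  # 1e16 + 1.0 rounds back to 1e16
reordered = (vals[0] + vals[2]) + vals[1]      # cancellation first preserves the 1.0
print(left_to_right, reordered)  # 0.0 1.0
```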
note2: updating the histogram observer breaks some BC tests unless I
regenerate the model using the HistogramObserver from this PR.
Test Plan: python test/test_quantization.py TestHistogramObserver.test_histogram_observer_correct_numel
python test/test_quantization.py -k histogram
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90355
Approved by: https://github.com/vkuzo
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74204
Minor follow-up to
https://github.com/pytorch/pytorch/pull/73863 that re-enables a
serialization test.
Test Plan:
python test/test_quantization.py
TestSerialization.test_linear_relu_package_quantization_transforms
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D34880378
fbshipit-source-id: f873f63e46cfcd936d7bdffb15c8f2d29e27b3c0
(cherry picked from commit 6a11a3b43ea130097a465304bf386e19992de03a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65538
Adds a test which verifies that `prepare_fx` and `convert_fx` work
on models created by `torch.package` in the past. In detail:
1. (one time) create a model and save it with torch.package. Also save the input,
expected output, and names of quantization-related get_attrs added by
our passes.
2. (every time) load the model from (1), and verify that expected output
matches current output, and that get_attr targets did not change.
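The save-once / verify-every-time pattern can be sketched outside of torch.package, as a hypothetical minimal analogue using pickle (the artifact path, function names, and payload layout here are all illustrative, not the actual test's):

```python
import os
import pickle

ARTIFACT = "model_reference_v1.pkl"  # hypothetical checked-in artifact path

def save_reference(model_fn, example_input):
    # Step 1 (one time): persist the input and the expected output.
    expected = model_fn(example_input)
    with open(ARTIFACT, "wb") as f:
        pickle.dump({"input": example_input, "expected": expected}, f)

def check_reference(model_fn):
    # Step 2 (every time): reload the artifact and verify that current
    # behavior still matches what was recorded in the past.
    with open(ARTIFACT, "rb") as f:
        ref = pickle.load(f)
    assert model_fn(ref["input"]) == ref["expected"]
```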
Test Plan:
```
python test/test_quantization.py TestSerialization.test_linear_relu_package_quantization_transforms
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D31512939
fbshipit-source-id: 718ad5fb66e09b6b31796ebe0dc698186e9a659f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63043
In version 1 we use the fused module/operator during QAT. Making this the default for all QAT runs going forward.
Older models saved after prepare_qat_fx can still load their state_dict into a model prepared using version 1.
The state_dict will still have the same attribute for the observer/fake_quant modules.
There may be some numerics difference between the old observer code in observer.py and the new fused module that was
re-written in C++/CUDA to perform observe + fake_quantize.
This PR also updates the test to check for the new module instead of the default FakeQuantize module.
Note: there are also some changes to make the operator work for multi-dim per-channel quantization, and the test is updated accordingly.
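For reference, per-channel fake-quantization (each channel carrying its own scale and zero point) can be sketched in pure Python; this is an illustrative simplification, not the fused C++/CUDA operator:

```python
def fake_quantize_per_channel(x_rows, scales, zero_points,
                              quant_min=-128, quant_max=127):
    # x_rows: list of channels; each channel gets its own scale/zero_point.
    out = []
    for row, s, zp in zip(x_rows, scales, zero_points):
        # Quantize: round to the integer grid and clamp to the quant range.
        q = [max(quant_min, min(quant_max, round(v / s) + zp)) for v in row]
        # Dequantize back to float, modeling the quantization error.
        out.append([(qi - zp) * s for qi in q])
    return out
```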
Test Plan:
python test/test_quantization.py TestSerialization.test_default_qat_qconfig
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D30232222
fbshipit-source-id: f3553a1926ab7c663bbeed6d574e30a7e90dfb5b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62345
This PR updates the attribute names from min_vals to min_val. The motivation is to keep the attribute name consistent with per-tensor observers, so that dependencies (like FusedMovingAvgObsFakeQuantize) don't need to differentiate between the two observer types to access the attributes.
It also adds some BC tests to make sure that observers saved earlier with min_vals/max_vals can be loaded depending on the state_dict version.
Note: Scriptability of the observers isn't fully supported yet, so we aren't testing for that in this PR.
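The BC handling amounts to a key-rename shim when loading old state_dicts. A hedged pure-Python sketch of that pattern (the function name and key layout are illustrative; the real code lives in the observer's `_load_from_state_dict`):

```python
def migrate_state_dict(state_dict, prefix=""):
    # Map old (v1) attribute names to the new ones so that checkpoints
    # saved before the rename still load.
    renames = {"min_vals": "min_val", "max_vals": "max_val"}
    out = {}
    for key, value in state_dict.items():
        suffix = key[len(prefix):] if key.startswith(prefix) else key
        out[prefix + renames.get(suffix, suffix)] = value
    return out
```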
Test Plan:
python test/test_quantization.py TestSerialization
Imported from OSS
Reviewed By: HDCharles
Differential Revision: D30003700
fbshipit-source-id: 20e673f1bb15e2b209551b6b9d5f8f3be3f85c0a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60241
We're going to make a forward-incompatible change to this serialization
format soon, so I'm taking the opportunity to do a little cleanup.
- Use int for version. This was apparently not possible when V2
was introduced, but it works fine now as long as we use int64_t.
(Note that the 64-bits are only used in memory. The serializer will
use 1 byte for small non-negative ints.)
- Remove the "packed params" tensor and replace it with a list of ints.
- Replace the "transpose" field with "flags" to allow more binary flags
to be packed in.
- Unify required and optional tensors. I just made them all optional
and added an explicit assertion for the one we require.
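The "flags" field replacing the single "transpose" boolean is a standard bitfield. A minimal sketch, assuming a hypothetical bit layout (the real bit assignments are defined by the serializer, not shown here):

```python
# Hypothetical flag layout; only bit 0 is meaningful in this sketch.
FLAG_TRANSPOSE = 1 << 0  # replaces the old standalone "transpose" field

def pack_flags(transpose: bool) -> int:
    # Additional binary flags can be OR-ed in later without changing
    # the serialization format.
    flags = 0
    if transpose:
        flags |= FLAG_TRANSPOSE
    return flags

def unpack_transpose(flags: int) -> bool:
    return bool(flags & FLAG_TRANSPOSE)
```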
A bit of a hack: I added an always-absent tensor to the front of the
tensor list. Without this, when passing unpacked params from Python to
the ONNX JIT pass, the type would be inferred as `List[Tensor]` if all
tensors were present, making it impossible to cast to
`std::vector<c10::optional<at::Tensor>>` without jumping through hoops.
The plan is to ship this, along with another diff that adds a flag to
indicate numerical requirements, wait a few weeks for an FC grace
period, then flip the serialization version.
Test Plan: CI. BC tests.
Reviewed By: vkuzo, dhruvbird
Differential Revision: D29349782
Pulled By: dreiss
fbshipit-source-id: cfef5d006e940ac1b8e09dc5b4c5ecf906de8716
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43086
This PR changes the format of `ConvPackedParam` in a nearly backwards-compatible way:
* a new format is introduced which has more flexibility and a lower on-disk size
* custom pickle functions are added to `ConvPackedParams` which know how to load the old format
* the custom pickle functions are **not** BC because the output type of `__getstate__` has changed. We expect this to be acceptable as no user flows are actually broken (loading a v1 model with v2 code works), which is why we whitelist the failure.
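The version-aware pickle pattern described above can be sketched in pure Python; the class name, attributes, and state layout here are hypothetical, and the real ConvPackedParams state differs:

```python
import pickle

class PackedParams:
    # Sketch of custom pickle functions that always write the new (v2)
    # format but can still read the old (v1) format.
    _VERSION = 2

    def __init__(self, weight=None, stride=None):
        self.weight = weight
        self.stride = stride

    def __getstate__(self):
        # v2 format: a (version, payload) tuple. This is the output-type
        # change that makes __getstate__ non-BC.
        return (self._VERSION, {"weight": self.weight, "stride": self.stride})

    def __setstate__(self, state):
        if isinstance(state, tuple) and state[0] == self._VERSION:
            payload = state[1]
            self.weight = payload["weight"]
            self.stride = payload["stride"]
        else:
            # v1 format: a plain attribute dict, loaded as-is.
            self.__dict__.update(state)
```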
Test plan (TODO finalize):
```
// adhoc testing of saving v1 and loading in v2: https://gist.github.com/vkuzo/f3616c5de1b3109cb2a1f504feed69be
// test that loading models with v1 conv params format works and leads to the same numerics
python test/test_quantization.py TestSerialization.test_conv2d_graph
python test/test_quantization.py TestSerialization.test_conv2d_nobias_graph
// test that saving and loading models with v2 conv params format works and leads to same numerics
python test/test_quantization.py TestSerialization.test_conv2d_graph_v2
python test/test_quantization.py TestSerialization.test_conv2d_nobias_graph_v2
// TODO before land:
// test numerics for a real model
// test legacy ONNX path
```
Note: this is a newer copy of https://github.com/pytorch/pytorch/pull/40003
Test Plan: Imported from OSS
Reviewed By: dreiss
Differential Revision: D23347832
Pulled By: vkuzo
fbshipit-source-id: 06bbe4666421ebad25dc54004c3b49a481d3cc92
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43524
1. adds support for testing BC on data format and numerics for graph mode
quantized modules
2. using the above, adds coverage for quantized conv2d on graph mode
Test Plan:
```
python test/test_quantization.py TestSerialization.test_conv2d_nobias
python test/test_quantization.py TestSerialization.test_conv2d_graph
python test/test_quantization.py TestSerialization.test_conv2d_nobias_graph
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D23335222
fbshipit-source-id: 0c9e93a940bbf6c676c2576eb62fcc725247588b
Summary:
Previously, dynamic LSTM modules weren't able to save/load from state_dict, since the PackedParameter used in RNNs isn't serializable from Python.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39105
Test Plan: python test/test_quantization.py TestSerialization
Reviewed By: jerryzh168
Differential Revision: D21752256
Pulled By: supriyar
fbshipit-source-id: ef82cf21ce21a3a1304d147ed0da538c639f952d
Summary:
re-created the same PR: https://github.com/pytorch/pytorch/pull/36639
because ghimport does not support importing binary files right now
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36842
Test Plan: python test/quantization/test_backward_compatibility.py
Differential Revision: D21100689
Pulled By: jerryzh168
fbshipit-source-id: 625a0f9da98138c9c2891b9d99fc45d85fa27cca
Summary:
re-created the same PR: https://github.com/pytorch/pytorch/pull/36639
because ghimport does not support importing binary files right now
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36771
Test Plan: python test/quantization/test_backward_compatibility.py
Differential Revision: D21080503
Pulled By: jerryzh168
fbshipit-source-id: 1dca08208bccead60bba03e5fb5d39e1a1d7c20d