Commit Graph

6 Commits

Author SHA1 Message Date
Xiang Gao
20ac736200 Remove py2 compatible future imports (#44735)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735

Reviewed By: mruberry

Differential Revision: D23731306

Pulled By: ezyang

fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f
2020-09-16 12:55:57 -07:00
Paul Shao
382781221d Extending Learnable Fake Quantize module to support gradient scaling and factory (partial) construction (#41969)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41969

In this diff, the `_LearnableFakeQuantize` module is extended to support gradient scaling, where the gradients for both scale and zero point are multiplied by a constant `g` (which in some cases can help with quicker convergence). In addition, it is augmented with a factory method, `_with_args`, so that a partial constructor of the module can be built.
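
A minimal sketch of the two features described above, assuming the `_with_args` factory behaves like the existing `with_args` classmethod on `FakeQuantize`, and using plain autograd hooks to illustrate the gradient-scaling constant `g` (names and values here are illustrative, not the module's actual API):
```
import torch
from torch.quantization import FakeQuantize, MovingAverageMinMaxObserver

# (1) Factory / partial construction: with_args returns a partial constructor
# that remembers its keyword arguments and builds the module later.
fq_ctor = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver, quant_min=0, quant_max=255)
fq = fq_ctor()  # the remembered kwargs are applied at construction time

# (2) Gradient scaling: multiply the scale / zero-point gradients by a
# constant g before they reach the optimizer, here via tensor hooks.
scale = torch.nn.Parameter(torch.tensor(0.1))
zero_point = torch.nn.Parameter(torch.tensor(0.0))
g = 0.01  # illustrative value; a useful magnitude of g is model dependent
scale.register_hook(lambda grad: grad * g)
zero_point.register_hook(lambda grad: grad * g)
```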

Test Plan:
To verify the correctness of the fake quantizer operators, enter the following command on a devvm:
```
buck test //caffe2/test:quantization -- learnable_py_module
```

Reviewed By: z-a-f

Differential Revision: D22715629

fbshipit-source-id: ff8e5764f81ca7264bf9333789f57e0b0cec7a72
2020-07-29 10:22:26 -07:00
Paul Shao
5a6d88d503 Updates to Scale and Zero Point Gradient Calculation (#42034)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42034

In this diff, the scale and zero point gradient calculations are updated to reflect the actual backpropagation equation: the near-final output should be `dScale * dY` rather than `dScale * dX`, and the same applies to the zero point.
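
To make the fix concrete, here is a hedged, self-contained illustration of the chain rule this diff restores; tensor names are illustrative and do not mirror the kernel's actual code:
```
import torch

# Stand-ins for quantities available inside the backward pass.
x = torch.randn(4, 8)
local_dscale = torch.randn_like(x)   # element-wise d(fake_quant(x)) / d(scale)
grad_output = torch.randn_like(x)    # dY, the gradient from the next layer

# After this fix: the local derivative is paired with the upstream gradient dY
# before reducing to the per-tensor scale gradient.
grad_scale = (grad_output * local_dscale).sum()

# The pre-fix behavior effectively paired it with dX instead, e.g.
# (grad_input * local_dscale).sum(), which is not the backpropagation equation.
```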

Test Plan:
To run the unit tests for all affected learnable fake quantize modules and kernels, execute the following command on a devvm:

`buck test //caffe2/test:quantization -- learnable`

To enable the `cuda` tests, execute the following command:

`buck test mode/dev-nosan //caffe2/test:quantization -- learnable`

Reviewed By: jerryzh168

Differential Revision: D22735668

fbshipit-source-id: 45c1e0fd38cbb2d8d5e60be4711e1e989e9743b4
2020-07-27 11:18:49 -07:00
Paul Shao
c261a894d1 Updates to Python Module for Calculation of dX and Addition of Unit Tests (#42033)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42033

In this diff, the Python `_LearnableFakeQuantize` module is updated so that the gradient with respect to the input `x` is actually computed rather than simply passed through. Argument names are also updated for clarity, and unit tests on the `PerTensor` and `PerChannel` operators are added to assert correctness.
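
As a sketch of what "actually computed" means for the input gradient, the standard straight-through estimate passes the upstream gradient only where the un-clamped quantized value lands inside the quantization range; all parameter values below are illustrative:
```
import torch

x = torch.randn(4, 8)
grad_output = torch.randn_like(x)   # dY from the next layer
scale, zero_point = 0.1, 0.0
quant_min, quant_max = -128, 127

# Straight-through estimate for dX: pass dY through only where the
# (pre-clamp) quantized value lies inside [quant_min, quant_max].
q = torch.round(x / scale) + zero_point
mask = (q >= quant_min) & (q <= quant_max)
grad_input = grad_output * mask.to(grad_output.dtype)
```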

Test Plan:
On a devvm, execute the command:

`buck test //caffe2/test:quantization -- learnable_py_module`

To include `cuda` tests as well, run:

`buck test mode/dev-nosan //caffe2/test:quantization -- learnable_py_module`

Reviewed By: jerryzh168

Differential Revision: D22735580

fbshipit-source-id: 66bea7e9f8cb6422936e653500f917aa597c86de
2020-07-27 11:18:47 -07:00
Paul Shao
9e0c746b15 Augmenting Concrete Observer Constructors to Support Dynamic Quantization Range; Modifying Utility Functions in _LearnableFakeQuantize Module for Better Logging and Baseline Construction. (#41815)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41815

**All are minor changes to enable better simulations.**

The constructors of `MinMaxObserver`, `MovingAverageMinMaxObserver`, `PerChannelMinMaxObserver`, and `MovingAveragePerChannelMinMaxObserver` are augmented so they can utilize the dynamic quantization range support in the `_ObserverBase` class.

In addition, minor adjustments are made to the `enable_static_observation` function so that the observer can update the quantization parameters without fake quantizing the output (useful for constructing a baseline).
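
As an illustration of the dynamic quantization range support, a hedged sketch of constructing one of these observers with a narrowed range; the `quant_min`/`quant_max` keyword names are assumed from `_ObserverBase`, and the concrete values are only an example:
```
import torch
from torch.quantization import MovingAverageMinMaxObserver

# With the constructor forwarding the range arguments to _ObserverBase,
# an observer can be configured for, e.g., an unsigned 4-bit range.
obs = MovingAverageMinMaxObserver(dtype=torch.quint8, quant_min=0, quant_max=15)
obs(torch.randn(16, 8))                      # update the running min/max statistics
scale, zero_point = obs.calculate_qparams()  # qparams for the narrowed range
```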

Test Plan:
To ensure this modification remains backward compatible with existing usage, numerics are verified by running the quantization unit test suite, which contains various observer tests. The following command executes the suite:
```
buck test //caffe2/test:quantization -- observer
```

Reviewed By: z-a-f

Differential Revision: D22649128

fbshipit-source-id: 32393b706f9b69579dc2f644fb4859924d1f3773
2020-07-21 17:59:40 -07:00
Paul Shao
5c50cb567c Generalized Learnable Fake Quantizer Module (#41535)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41535

A generalized fake quantization module is built to support lower-bit fake quantization with backpropagation on the scale and zero point. The module supports both per tensor and per channel fake quantization.

Test Plan:
Please see diff D22337313 for a related experiment performed on the fake quantizer module.

The `_LearnableFakeQuantize` module supports the following use cases:
- Per Tensor Fake Quantization or Per Channel Fake Quantization
- Static Estimation from Observers or Quantization Parameter Learning through Back Propagation

By default, the module assumes per tensor affine fake quantization. To switch to per channel, declare `channel_size` with the appropriate length during initialization. To toggle between static estimation and parameter learning through backpropagation, invoke `enable_static_estimate` or `enable_param_learning` (a usage sketch follows below). For more information on the flags that support these operations, please see the docstring of the `_LearnableFakeQuantize` module.
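
A usage sketch based on the description above; the import path and constructor keywords (including `channel_size`) follow this commit message and may not match the module's exact signature:
```
import torch
from torch.quantization import MovingAverageMinMaxObserver
from torch.quantization._learnable_fake_quantize import _LearnableFakeQuantize

# Per tensor affine fake quantization (the default). A per channel variant
# would additionally declare `channel_size` at initialization.
fq = _LearnableFakeQuantize(observer=MovingAverageMinMaxObserver,
                            quant_min=0, quant_max=255)

fq.enable_static_estimate()   # scale / zero point are estimated by the observer
y = fq(torch.randn(8, 64))

fq.enable_param_learning()    # scale / zero point are now learned via backprop
y = fq(torch.randn(8, 64))
y.sum().backward()            # gradients reach the learnable scale / zero point
```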

The `_LearnableFakeQuantize` module relies on two operators for its forward and backward paths: `_LearnableFakeQuantizePerTensorOp` and `_LearnableFakeQuantizePerChannelOp`. The backpropagation routine is based on the following literature (a sketch of the backward pass follows the references):
- Learned Step Size Quantization: https://openreview.net/pdf?id=rkgO66VKDS
- Trained Quantization Thresholds: https://arxiv.org/pdf/1903.08066.pdf
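
The following is a minimal, self-contained sketch of an LSQ/TQT-style per tensor backward pass, reconstructed from the cited papers and the gradient descriptions in the commits above; it is illustrative only and not the actual `_LearnableFakeQuantizePerTensorOp` kernel:
```
import torch

class LearnableFakeQuantizePerTensor(torch.autograd.Function):
    """Illustrative per tensor fake quantize with learnable scale/zero point."""

    @staticmethod
    def forward(ctx, x, scale, zero_point, quant_min, quant_max, grad_factor):
        ctx.save_for_backward(x, scale, zero_point)
        ctx.quant_min, ctx.quant_max, ctx.grad_factor = quant_min, quant_max, grad_factor
        q = torch.round(x / scale) + zero_point
        return (torch.clamp(q, quant_min, quant_max) - zero_point) * scale

    @staticmethod
    def backward(ctx, grad_output):
        x, scale, zero_point = ctx.saved_tensors
        q = torch.round(x / scale) + zero_point
        below, above = q < ctx.quant_min, q > ctx.quant_max
        inside = ~(below | above)

        # dY/dX: straight-through estimator, 1 inside the range, 0 outside.
        grad_x = grad_output * inside

        # dY/dScale: rounding error inside the range, clamped level outside.
        d_scale = torch.where(
            inside,
            torch.round(x / scale) - x / scale,
            torch.where(below,
                        ctx.quant_min - zero_point,
                        ctx.quant_max - zero_point))
        # dY/dZeroPoint: zero inside the range, -scale where clamping occurred.
        d_zp = torch.where(inside, torch.zeros_like(x), -scale.expand_as(x))

        # Chain rule against the upstream gradient dY, reduced per tensor;
        # grad_factor plays the role of the gradient scaling constant g.
        grad_scale = (grad_output * d_scale).sum().reshape(scale.shape) * ctx.grad_factor
        grad_zp = (grad_output * d_zp).sum().reshape(zero_point.shape) * ctx.grad_factor
        return grad_x, grad_scale, grad_zp, None, None, None

# Example: quantize to an 8-bit unsigned range and backpropagate.
x = torch.randn(4, 8, requires_grad=True)
scale = torch.tensor(0.1, requires_grad=True)
zero_point = torch.tensor(0.0, requires_grad=True)
y = LearnableFakeQuantizePerTensor.apply(x, scale, zero_point, 0, 255, 1.0)
y.sum().backward()   # populates x.grad, scale.grad and zero_point.grad
```
For the per channel operator, the same local derivatives apply, with the final reductions taken over every dimension except the channel axis.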

Reviewed By: z-a-f

Differential Revision: D22573645

fbshipit-source-id: cfd9ece8a959ae31c00d9beb1acf9dfed71a7ea1
2020-07-20 18:24:21 -07:00