pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
asl3	13ad4739a6	[quant] Implement PTQ for APoT FakeQuant (#81040) ### Summary: This PR implements PTQ for APoT FakeQuant. It runs models (Resnet-18 pre-trained model, ImageNet dataset) to compare accuracy metrics for different qconfig settings of uniform vs. APoT quantized activation and weight. According to the collected accuracy stats, model #2 (uniform activation and APoT weight) appears to have a slight improvement in accuracy compared to model #1 (uniform activation and uniform weight) for 8-bit and significant improvement for 4-bit (see "Accuracy Stats" section below). ### Test Plan: Run models with: `python test/quantization/core/experimental/fx_graph_mode_apot.py` ### Accuracy Stats: 8-bit (Uniform int8, APoT b = 8 k = 2) Model #1: Uniform activation, uniform weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 64.43% (Top-1), 85.62% (Top-5) Model #2: Uniform activation, APoT weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 64.51% (Top-1), 85.78% (Top-5) Model #3: APoT activation, APoT weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 64.32% (Top-1), 85.78% (Top-5) 4-bit (Uniform int4, APoT b = 4 k = 2) Model #1: Uniform activation, uniform weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 45.63% (Top-1), 71.96% (Top-5) Model #2: Uniform activation, APoT weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 64.24% (Top-1), 85.56% (Top-5) Model #3: APoT activation, APoT weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 45.40% (Top-1), 76.21% (Top-5) Full Precision model (FX Graph Mode quantized) Evaluation accuracy on test dataset: 69.76% (Top-1), 89.08% (Top-5) Eager mode quantized model Evaluation accuracy on test dataset: 69.49% (Top-1), 88.90% (Top-5) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81040 Approved by: https://github.com/jerryzh168	2022-07-28 07:21:31 +00:00
asl3	a01fb5392f	Modify APoT dequantize method (#82126) ### Summary Modify APoT dequantize method to correctly add dequantized values to result numpy array and retain original tensor dimensions ### Test Plan Run unit tests with: `python test/quantization/core/experimental/test_quantizer.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/82126 Approved by: https://github.com/HDCharles	2022-07-26 20:26:51 +00:00
asl3	368018530e	[quant] Implement forward and backward autograd functions for fake quantize (#81438) ### Summary: This PR implements custom autograd functions for forward and backward to be used in APoT fake quantization. The implementation follows this doc about custom autograd functions: https://pytorch.org/tutorials/beginner/examples_autograd/polynomial_custom_function.html ### Test Plan: Run tests with: `python test/quantization/core/experimental/test_fake_quantize.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/81438 Approved by: https://github.com/jerryzh168	2022-07-19 02:15:30 +00:00
asl3	5b493ba18b	[quant] Refactor quantize clamping into float_to_apot util function (#80885) ### Summary: This PR moves the clamping functionality from `quantize` to `float_to_apot` util function to align with the uniform quantize workflow in the codebase. ### Test Plan: Run unit tests with: `python pytorch/test/quantization/core/experimental/test_quantizer.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/80885 Approved by: https://github.com/dzdang	2022-07-05 19:28:37 +00:00
asl3	2727d88569	[quant] Modify APoT global methods to align with uniform API (#80364) ### Summary: This PR updates the APoT global API method signatures and parameters for `dequantize_APoT` and `calculate_qparams` to align with their uniform counterparts in the codebase. ### Test Plan: Run unit tests with: `python pytorch/test/quantization/core/experimental/test_nonuniform_observer.py` `python pytorch/test/quantization/core/experimental/test_quantizer.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/80364 Approved by: https://github.com/jerryzh168	2022-06-27 22:48:09 +00:00
asl3	777c12f2df	[quant] Modify APoT nonuniform quantization workflow (#80075) ### Summary: This PR updates the design of APoT Observer, Quantizer, and Tensor to be more consistent with their uniform counterparts in the PyTorch framework. APoT Observer now calculates alpha as the max between the absolute values of the max and min values in the input tensor. APoT Quantizer is modified so its instance methods `quantize_APoT` and `dequantize_APoT` are called by their global method counterparts. APoT Tensor is modified to account for the new method definition of the `quantize_APoT` from APoT Quantizer. ### Test Plan: Run APoT Observer class unit tests with: `python pytorch/test/quantization/core/experimental/test_nonuniform_observer.py` Run APoT Quantize class unit tests with: `python pytorch/test/quantization/core/experimental/test_quantizer.py` Run APoT Tensor class unit tests with: `python pytorch/test/quantization/core/experimental/test_quantized_tensor.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/80075 Approved by: https://github.com/jerryzh168	2022-06-27 14:54:06 +00:00
asl3	0b349f7e69	[quant] Dequantize apot tensor Pull Request resolved: https://github.com/pytorch/pytorch/pull/79530 Approved by: https://github.com/dzdang, https://github.com/jerryzh168	2022-06-22 05:15:06 +00:00
asl3	d6ec8398a9	[quant] Implement quantize APoT method Pull Request resolved: https://github.com/pytorch/pytorch/pull/79499 Approved by: https://github.com/dzdang, https://github.com/jerryzh168	2022-06-22 05:15:06 +00:00
asl3	f89e640810	[quant] Add quantizer class skeleton Pull Request resolved: https://github.com/pytorch/pytorch/pull/79936 Approved by: https://github.com/jerryzh168	2022-06-22 05:11:15 +00:00
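
Two of the commit messages above describe small numeric steps: the observer computing alpha as the larger of the absolute min and max of the input (#80075), and clamping being applied in the float-to-APoT conversion step (#80885). The following is a minimal pure-Python sketch of those ideas, not the code from these commits. The function names `calculate_alpha`, `clamp_to_range`, and `nearest_level`, and the example level set, are hypothetical and chosen only for illustration; the actual implementation operates on tensors and derives its quantization levels from the APoT parameters b and k.

```python
from typing import Sequence


def calculate_alpha(values: Sequence[float]) -> float:
    """Return max(|min(values)|, |max(values)|), the alpha described in #80075."""
    return max(abs(min(values)), abs(max(values)))


def clamp_to_range(x: float, alpha: float) -> float:
    """Clamp x into the symmetric range [-alpha, alpha]."""
    return max(-alpha, min(alpha, x))


def nearest_level(x: float, levels: Sequence[float]) -> float:
    """Map x to the closest entry in a (hypothetical) list of quantization levels."""
    return min(levels, key=lambda level: abs(level - x))


def quantize_sketch(x: float, alpha: float, levels: Sequence[float]) -> float:
    """Clamp first, then snap to the nearest level (illustrative ordering only)."""
    return nearest_level(clamp_to_range(x, alpha), levels)


if __name__ == "__main__":
    data = [-3.0, 1.0, 2.5]
    alpha = calculate_alpha(data)
    # Hypothetical, evenly spaced levels; real APoT levels are nonuniform.
    example_levels = [-3.0, -1.5, 0.0, 1.5, 3.0]
    print(alpha)
    print([quantize_sketch(v, alpha, example_levels) for v in data + [10.0]])
```

Clamping before the nearest-level lookup keeps out-of-range inputs such as `10.0` from being compared against levels they can never reach, which is the kind of behavior a shared conversion helper can centralize.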

9 Commits