Commit Graph

23 Commits

Author SHA1 Message Date
HDCharles
428cbd7513 [ao] fixing multihead attention convert size (#110407)
Summary: after converting nn.multihead attention we weren't deleting the
old in_proj_weight and in_proj_bias despite not (really) using them.

Test Plan: python test/test_quantization.py -k
"test_custom_module_multi_head_attention"

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110407
Approved by: https://github.com/jerryzh168
2023-10-03 08:49:12 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please hep them get unblocked? It seems like one of the strings was prob accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
Justin Chu
c0d8a4af0a [BE] Enable ruff's UP rules and autoformat ao/ (#105430)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105430
Approved by: https://github.com/albanD, https://github.com/malfet
2023-07-19 13:44:37 +00:00
Nikita Shulga
5837e95d30 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

That were reverted due to the conflict with internal source repo.

Mostly fixes for PEP-484 violation (i.e. when default arg is set to None, but type is not annotated as optional)
Plus few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export. deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert it `torch/optim/optimizer.py` that Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`

Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add hack to squash older libstdc++ from conda environment in favor one from OS to `.ci/docker/install_conda.sh`
- Update bazel cuda builds to focal, as with libstdc++-6.0.32 bazel builds loose the ability to catch exceptions (probably because they link with cupti statically, but I could not found where it is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-15 20:30:20 +00:00
PyTorch MergeBot
15fd1ea118 Revert "[Reland] Update mypy to 1.4.1 (#105227)"
This reverts commit c9c4f8efc3.

Reverted https://github.com/pytorch/pytorch/pull/105227 on behalf of https://github.com/atalman due to trying to mitigate ci sev #105248 ([comment](https://github.com/pytorch/pytorch/pull/105227#issuecomment-1636510935))
2023-07-14 22:28:35 +00:00
Nikita Shulga
c9c4f8efc3 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

That were reverted due to the conflict with internal source repo.

Mostly fixes for PEP-484 violation (i.e. when default arg is set to None, but type is not annotated as optional)
Plus few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export. deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert it `torch/optim/optimizer.py` that Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-14 20:45:12 +00:00
PyTorch MergeBot
b4d91b1c5b Revert "[Typing] Fix PEP 484 Violation (#105022)"
This reverts commit 4148b7bada.

Reverted https://github.com/pytorch/pytorch/pull/105022 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/105022#issuecomment-1635967734))
2023-07-14 14:45:09 +00:00
Nikita Shulga
4148b7bada [Typing] Fix PEP 484 Violation (#105022)
Not sure, how it worked before, but if arguments must be annotated is optional if they are defaulted to None

Towards enabling mypy-1.4.1 in lintrunner

<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at 5e1b9f4</samp>

> _We annotate the arguments of doom_
> _To show the `None` values of gloom_
> _We improve the type checking and readability_
> _With `Optional` annotations of metal-ity_

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105022
Approved by: https://github.com/izaitsevfb, https://github.com/huydhn, https://github.com/Skylion007
2023-07-12 10:20:48 +00:00
Nitin Jain
556bb691fd [AO]Fix observed LSTM layer setup individually observed LSTM (#101299)
Summary: We have found that `_get_lstm_with_individually_observed_parts()` is missing setup step which sets up the LSTM layer state initializing weights and biases of this layer. This diff fixes the observed numerical discrepancy seen by CTRL team in using the above API.

Test Plan: N3358643

Differential Revision: D45821681

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101299
Approved by: https://github.com/andrewor14
2023-05-18 19:15:01 +00:00
Nitin Jain
40df3b41aa [AO] Update qLSTM implementation to remove unsupported backend ops (#96436)
Summary:
The reference quantized LSTM implementation uses unbind and inplace squeeze both of which are not supported when building BoltNN's Espresso IR graph.

This change adjusts the reference AO Quantizable LSTM implementation without affecting numerically while enabling removal of unsupported ops in BoltNN.

Modifications & Adjustments
1. Unbind ops appear when unstacking tensor in loop. Replaced this by getting first dim from shape and looping using ranged index.
2. Removed unbind ops call where the pattern is
`[x = t.unbind(0) -> x[i]]` can be just replaced by `t[i]` as creating a tuple from unbind is unnecessary.
3. inplace squeeze `squeeze_` uses which were not required has been replaced by `squeeze`.

See notebook N3235193 which was used for testing quantization flow and inspect the torch scripted quantized model for the set of ops used(See last cell).

Test Plan: N3235193

Reviewed By: andrewor14

Differential Revision: D43935389

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96436
Approved by: https://github.com/andrewor14
2023-03-14 17:58:34 +00:00
andrewor14
faa4cb29b2 [Quant][fx] Create new FX-based LSTM reference module (#96343)
Summary: The previous LSTM reference module implementation did
not handle dtypes other than quint8 correctly. This is because
the internal LSTM custom module quantization used eager mode,
which did not insert the q-dq ops properly. E.g., we want the
following reference quantized model:

```
[dq -> linear1_fp32 -> q_to_qint32] -> dq -> q_to_quint8 ->
  [dq - linear2_fp32 -> q_to_quint8] -> dq -> ...
```

This requires two sets of `q - dq` pairs between two adjacent
ops that have different dtypes (linear1 and linear2). However,
these `q - dq` pairs were not inserted in the old flow, because
eager mode required users to insert Quant/DeQuantStubs manually.

This commit changes the internal LSTM custom module quantization
to use FX graph mode quantization, which automatically inserts
the `q - dq` ops that convert the dtypes between adjacent ops
correctly. However, using FX graph mode quantization here comes
with its own set of challenges that required some hacks to get
the end-to-end flow to work. These hacks are detailed in the
comments in the util functions.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_static_lstm_with_custom_fixed_qparams

This commit also updates the corresponding test to verify the
dtypes as well as the qparams in the reference quantized graph.
This test case should serve as an example for users to set up
their own LSTM reference module flows.

Reviewers: vkuzo, supriyar, jcaip

Subscribers: vkuzo, supriyar, jcaip
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96343
Approved by: https://github.com/vkuzo
2023-03-09 23:23:48 +00:00
Xuehai Pan
5b1cedacde [BE] [2/3] Rewrite super() calls in functorch and torch (#94588)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases that change the semantics should be kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94588
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-10 21:16:33 +00:00
Vasiliy Kuznetsov
f15ab8a7f2 AO migration: replace torch internal callsites (#94170)
Summary:

Do the following renames:
`torch.quantization` -> `torch.ao.quantization`
`torch.nn.quantized` -> `torch.ao.nn.quantized`
`torch.nn.quantizable` -> `torch.ao.nn.quantizable`
`torch.nn.qat` -> `torch.ao.nn.qat`
`torch.nn.intrinsic` -> `torch.ao.nn.intrinsic`

And then, do
`torch.ao.nn.quantized._reference` -> `torch.ao.nn.quantized.reference` to clean up the aftermath of https://github.com/pytorch/pytorch/pull/84974

Then, manually update `test/test_module_init.py` to fix hanging whitespace due to the replace.

Run this script to do the replacements: https://gist.github.com/vkuzo/7f7afebf8c31b9ba48306223e68a1c82

This is for https://github.com/pytorch/pytorch/issues/81667

Test plan: CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94170
Approved by: https://github.com/jerryzh168
2023-02-07 02:32:23 +00:00
Michael Gschwind
7265f60ad0 Regularize mask handling for attn_mask and key_padding_mask (#92733)
Summary:
Regularize mask handling for attn_mask and key_padding_mask
* Update documentation to remove reference to byte masks (which were deprecated long ago)
* Introduce check and warn about deprecation if attn_mask and key_padding_mask types mismatch
* Convert all masks to float before combining
* Combine by adding

Test Plan: sandcastle & github CI

Differential Revision: D42653215

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92733
Approved by: https://github.com/ngimel, https://github.com/drisspg
2023-01-24 14:12:05 +00:00
Xia, Weiwen
61a7618f3c [Quant][Eager] Copy MHA's batch_first attribute in prepare() (#91680)
**Summary**
Fixes #91571
MHA's batch_first attribute is not copied after `torch.quantization.prepare()`. Now we copy MHA's batch_first attribute in torch/ao/nn/quantizable/modules/activation.py: `MultiheadAttention.from_float()`.

**Test plan**
python test/test_quantization.py -k test_mha_batch_first_attr_is_copied_in_prepare

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91680
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-01-18 10:49:05 +00:00
Michael Gschwind
512ec181ec Introduce causal mask (#90508)
Summary: Introduce causal mask

This PR introduces a causal mask option _causal_mask (as well as causal mask detection if attn_mask is provided), since current custom kernels do not support arbitrary masks.

Test Plan: sandcastle & github ci/cd

Differential Revision: D41723137

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90508
Approved by: https://github.com/albanD
2022-12-16 21:39:42 +00:00
HDCharles
6a866c3ed1 [ao] fixing public v private for torch.ao.nn.X (#87883)
Summary: this mostly consisted of adding __all__ to files without them.
A few functions in X.utils were made private too

Test Plan: python test/test_public_bindings.py

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D40814548](https://our.internmc.facebook.com/intern/diff/D40814548)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87883
Approved by: https://github.com/jcaip, https://github.com/anjali411
2022-12-15 03:03:07 +00:00
andrewor14
19e66fcec2 [Quant] Allow setting fixed qparams for inner LSTM ops (#88456)
Summary: In both eager and FX graph mode quantization,
`torch.ao.nn.quantizable.LSTM` is used as an observed custom module,
which is responsible for inserting its own observers. By default,
the user specifies a single QConfig for the custom module (either
through QConfigMapping or by setting the "qconfig" attribute"),
and all inner ops will [inherit this
QConfig](dc00bb51b8/torch/ao/nn/quantizable/modules/rnn.py (L366-L378))
and use the same observer/fake_quantize constructors.

Today, users who wish to override this behavior must extend
`torch.ao.nn.quantizable.LSTM` and write a lot of custom code
to manually assign the QConfigs to the inner ops. This commit
alleviates this burden on the user by providing a helper function
to assign QConfigs with custom observers. An example use case of
this is providing a reference implementation for a backend kernel
that hardcodes qparams for efficiency.

Example usage:
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.fx.custom_config import (
    PrepareCustomConfig,
    ConvertCustomConfig,
)

class MyModel(torch.nn.Module):
    ...

class UserLSTM(torch.ao.nn.quantizable.LSTM):
    @classmethod
    def from_float(cls, other):
        assert isinstance(other, cls._FLOAT_MODULE)
        linear_output_obs_ctr = FixedQParamsObserver.with_args(
            scale=2 ** -11, zero_point=2 ** 15, dtype=torch.qint32)
        sigmoid_obs_ctr = FixedQParamsObserver.with_args(
            scale=2 ** -16, zero_point=0, dtype=torch.qint32)
        tanh_obs_ctr = FixedQParamsObserver.with_args(
            scale=2 ** -15, zero_point=2 ** 15, dtype=torch.qint32)
        cell_state_obs_ctr = FixedQParamsObserver.with_args(
            scale=2 ** -11, zero_point=0, dtype=torch.qint32)
        hidden_state_obs_ctr = FixedQParamsObserver.with_args(
            scale=2 ** -7, zero_point=2 ** 7, dtype=torch.quint8)
        return torch.ao.quantization.utils._get_lstm_with_individually_observed_parts(
            float_lstm=other,
            linear_output_obs_ctr=linear_output_obs_ctr,
            sigmoid_obs_ctr=sigmoid_obs_ctr,
            tanh_obs_ctr=tanh_obs_ctr,
            cell_state_obs_ctr=cell_state_obs_ctr,
            hidden_state_obs_ctr=hidden_state_obs_ctr,
        )

qconfig_mapping = get_default_qconfig_mapping()
example_inputs = (torch.rand(5, 3, 50), torch.rand(1, 3, 50), torch.randn(1, 3, 50))
prepare_custom_config = PrepareCustomConfig() \
    .set_float_to_observed_mapping(torch.nn.LSTM, UserLSTM)
convert_custom_config = ConvertCustomConfig() \
    .set_observed_to_quantized_mapping(UserLSTM, torch.ao.nn.quantized.LSTM)
model = MyModel()
model = prepare_fx(model, qconfig_mapping, example_inputs, prepare_custom_config=prepare_custom_config)
model(*example_inputs)  # calibrate
model = convert_fx(model, convert_custom_config=convert_custom_config)
model(*example_inputs)
```

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_static_lstm_with_custom_fixed_qparams

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88456
Approved by: https://github.com/jerryzh168, https://github.com/vkuzo
2022-11-18 16:27:12 +00:00
zaf
29e83b6599 [quant][ao_migration] torch.nn.quantizabletorch.ao.nn.quantizable. (#78717)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [X] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [X] [Current PR] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- `torch/ao/nn/__init__.py` → Changing the imports to lazy.

Differential Revision: [D36861090](https://our.internmc.facebook.com/intern/diff/D36861090/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861090/)!

Differential Revision: [D36861090](https://our.internmc.facebook.com/intern/diff/D36861090)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78717
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:37 +00:00
PyTorch MergeBot
e9dd4d5adf Revert "[quant][ao_migration] torch.nn.quantizabletorch.ao.nn.quantizable. (#78717)"
This reverts commit e0876feb49.

Reverted https://github.com/pytorch/pytorch/pull/78717 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted
2022-08-22 07:26:44 +00:00
zaf
e0876feb49 [quant][ao_migration] torch.nn.quantizabletorch.ao.nn.quantizable. (#78717)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [X] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [X] [Current PR] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- None

Differential Revision: [D36861090](https://our.internmc.facebook.com/intern/diff/D36861090/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861090/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78717
Approved by: https://github.com/jerryzh168
2022-08-22 05:31:48 +00:00