Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54254

In fp16 emulation, we now have patterns such as

```
... -> dequantize -> linear -> relu -> to(torch.float16) -> ...
```

This PR adds support for:
* specifying a subgraph's "base_op_node", which is the node with the op which should be matched to related nodes. In the example above, "base_op_node" would be the linear node, and it would be the second node in the matched pattern.
* matching these fusion patterns and properly setting "base_op_node" based on pattern and index.
* using "base_op_node" instead of "start_node" throughout the NS codebase wherever the intent is to match subgraphs or create names for subgraphs.

At the end of this PR, matching unshadowed activations with an example fp16 emulation pattern works e2e. I'm saving the following work for future PRs (soon), mostly to keep PR size manageable:
* adding weight matching (will require some changes to the function which extracts weights)
* adding shadowed activation matching (will require some changes to shadow copying)
* adding input logging for these patterns (will likely require some changes as well)

Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_linear_fp16
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D27158199

fbshipit-source-id: 49fc445395452fda62e3c7a243544190f9af691c
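The core idea above can be sketched in plain Python: given a matched multi-node subgraph, look up which node in the pattern carries the base op, rather than always assuming the first node. This is a minimal illustrative sketch, not the actual Numeric Suite internals; the names `Subgraph`, `FUSION_PATTERNS`, and `get_base_op_node` are hypothetical.

```python
# Hypothetical sketch of base_op_node selection. In the fp16 emulation
# pattern (dequantize, linear, relu, to_fp16), the op that should be
# matched across models is "linear", the node at index 1.

from typing import NamedTuple, Tuple

class Subgraph(NamedTuple):
    # nodes of a matched pattern, in execution order (ops as strings here)
    nodes: Tuple[str, ...]

# pattern -> index of the node whose op identifies the subgraph
FUSION_PATTERNS = {
    ("dequantize", "linear", "relu", "to_fp16"): 1,  # base op is "linear"
    ("linear",): 0,                                  # single-node pattern
}

def get_base_op_node(subgraph: Subgraph) -> str:
    """Return the node carrying the op used to match/name this subgraph.

    Falls back to the start node (index 0) for unregistered patterns,
    mirroring the pre-PR "start_node" behavior.
    """
    idx = FUSION_PATTERNS.get(subgraph.nodes, 0)
    return subgraph.nodes[idx]
```

For the example pattern, `get_base_op_node(Subgraph(("dequantize", "linear", "relu", "to_fp16")))` returns `"linear"`, so subgraph names and cross-model matches key off the linear op instead of the leading dequantize.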