pytorch/torch/_inductor
Jerry Zhang 1b51d29b66 [quant][pt2e] Enable constant folding for quantize ops (#109343)
Summary:
This PR adds constant folding for quantize ops, so that the quantized model stores
int8/int16 (etc.) weights instead of fp32 weights.
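
The effect described above can be illustrated with a minimal sketch in plain Python. This is not the Inductor implementation: the helper names, the per-tensor affine scheme, and the weight values, scale, and zero point are all assumptions chosen for illustration. It only shows that evaluating the quantize op once on a constant weight lets the integer result be stored in place of the fp32 original.

```python
from array import array


def quantize_per_tensor(values, scale, zero_point, qmin=-128, qmax=127):
    """Hypothetical helper: affine-quantize floats to the int8 range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize_per_tensor(qvalues, scale, zero_point):
    """Hypothetical helper: map quantized integers back to floats."""
    return [(q - zero_point) * scale for q in qvalues]


# Illustrative constant weight and quantization parameters (assumed values).
weight_fp32 = [0.5, -1.0, 0.25, 0.9921875]
scale, zero_point = 0.0078125, 0  # scale = 1/128, exact in binary floating point

# Without folding: the model keeps the fp32 weight (4 bytes per element)
# and the quantize op runs every time the model executes.
unfolded_storage = array("f", weight_fp32)

# With folding: the quantize op is evaluated once ahead of time and only
# the int8 result (1 byte per element) is stored in the model.
folded_storage = array("b", quantize_per_tensor(weight_fp32, scale, zero_point))

# The dequantize step remains in the graph and recovers approximate floats.
recovered = dequantize_per_tensor(folded_storage, scale, zero_point)
```

In this sketch the folded constant occupies a quarter of the storage of the fp32 weight; the values were picked to be exactly representable, so the round trip is lossless here, which is not true in general.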

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_fold_quantize

Will also verify in ExecuTorch later.

Differential Revision: [D49399210](https://our.internmc.facebook.com/intern/diff/D49399210)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109343
Approved by: https://github.com/kimishpatel, https://github.com/jgong5
2023-09-27 06:04:45 +00:00
codegen [inductor] support _scaled_dot_product_flash_attention fallback (#110085) 2023-09-27 00:09:56 +00:00
fx_passes fix sfdp patern 13 accuracy issue (#110001) 2023-09-26 15:23:45 +00:00
kernel [Inductor CUTLASS backend] Step 5: Gemm CUTLASS templates (#108015) 2023-09-12 17:44:38 +00:00
__init__.py Enable mypy checking in torch/_inductor/__init__.py (#108336) 2023-09-06 17:14:54 +00:00
autotune_process.py [inductor] Set CUDA_VISIBLE_DEVICES for multi-device subprocess autotuning (#109500) 2023-09-21 14:29:30 +00:00
bounds.py [inductor] Enable mypy checking for torch/_inductor/bounds.py (#109271) 2023-09-15 17:47:24 +00:00
codecache.py [aotinductor] Relax the CUDAGuard device index check (#110030) 2023-09-26 16:23:23 +00:00
compile_fx.py [AOTInductor] Skip pre_grad_passes for exported graph. (#109246) 2023-09-14 13:30:12 +00:00
config.py [RFC] Allow "spawn" start method for torchinductor workers. (#108850) 2023-09-25 21:30:17 +00:00
constant_folding.py [quant][pt2e] Enable constant folding for quantize ops (#109343) 2023-09-27 06:04:45 +00:00
coordinate_descent_tuner.py Move has_triton to top level triton utils so that dynamo can also access (#109832) 2023-09-22 19:33:41 +00:00
cudagraph_trees.py Fix 0-sized views of tensors in cudagraphs (#109055) 2023-09-12 01:24:43 +00:00
debug.py [inductor] visualize fused ops in svg graph (#107752) 2023-09-21 08:03:05 +00:00
decomposition.py [core IR] Add a core decomposition for aten.all (#110093) 2023-09-27 01:31:41 +00:00
dependencies.py Enable typechecking for _inductor/virtualized.py (#108916) 2023-09-13 13:04:51 +00:00
exc.py Enable typing for _inductor/exc.py (#109176) 2023-09-15 12:36:59 +00:00
freezing.py Fix spelling / capitalization in freezing.py error message (#109347) 2023-09-18 18:12:20 +00:00
fx_utils.py Enable typechecking for _inductor/fx_utils.py (#109415) 2023-09-18 18:12:23 +00:00
graph.py [RFC] Add debug log as we lower each FX node (#109602) 2023-09-22 03:10:22 +00:00
hooks.py Track exact origin_node on best effort basis (#100110) 2023-04-28 04:15:27 +00:00
index_propagation.py Basic fp8 support in Inductor (#109168) 2023-09-23 04:41:41 +00:00
inductor_prims.py [inductor] Lower masked_scatter on CUDA (#108803) 2023-09-15 16:36:06 +00:00
ir.py [inductor] support _scaled_dot_product_flash_attention fallback (#110085) 2023-09-27 00:09:56 +00:00
lowering.py Basic fp8 support in Inductor (#109168) 2023-09-23 04:41:41 +00:00
metrics.py Estimate Scheduler node runtimes (#106426) 2023-08-17 17:23:30 +00:00
optimize_indexing.py Enable mypy check in torch/_inductor/optimize_indexing.py (#107943) 2023-08-28 17:08:13 +00:00
pattern_matcher.py Back out "[pytorch][PR] [Inductor] Extend Pattern Matcher to Match Equivalent Function Invocation" (#109931) 2023-09-23 05:58:08 +00:00
quantized_lowerings.py [Quant][Inductor] Enable the lowering of quantized maxpool2d (#105906) 2023-08-26 08:36:47 +00:00
scheduler.py Move has_triton to top level triton utils so that dynamo can also access (#109832) 2023-09-22 19:33:41 +00:00
select_algorithm.py [Inductor] Generalize inductor triton backend device agnostic (#109486) 2023-09-24 07:49:20 +00:00
sizevars.py AOTInductor dynamic shape (#109012) 2023-09-14 08:00:30 +00:00
test_operators.py
triton_helpers.py [inductor] Add ir.WelfordReduction with multiple outputs (#104725) 2023-08-18 08:18:01 +00:00
triton_heuristics.py [Inductor] Generalize inductor triton backend device agnostic (#109486) 2023-09-24 07:49:20 +00:00
utils.py [aotinductor] Rename aot_runtime to aoti_runtime (#110007) 2023-09-26 00:46:54 +00:00
virtualized.py Enable typechecking for _inductor/virtualized.py (#108916) 2023-09-13 13:04:51 +00:00
wrapper_benchmark.py [inductor] Add CPU-side profiler event names for templates and foreach kernels (#108449) 2023-09-09 02:11:13 +00:00