pytorch/torch/_inductor
leslie-fang-intel b18ba9419e [AO][Inductor] Enable WOQ fusion pattern with permute (#135928)
**Summary**
Fixes https://github.com/pytorch/pytorch/issues/135831 and https://github.com/pytorch/ao/issues/890. The numerical failure occurred because the customized WOQ int8 kernel was no longer triggered: the graph pattern had changed, so the fusion pattern stopped matching. This change re-adds the fusion pattern (now with permute), and the accuracy check passes again. I will open a separate TorchAO PR to enable these unit tests in TorchAO.
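
For illustration only, here is a minimal pure-Python sketch of the computation that a weight-only int8 fusion has to recognize in both of its spellings. It assumes the linear form is `linear(x, w.to(float)) * scales` and that the permute form is a matmul against the transposed weight. All function names are hypothetical and none of this is Inductor code.

```python
# Illustrative sketch (not Inductor code): weight-only int8 linear written two ways.
# Assumed shapes: x is [M][K] floats, w_int8 is [N][K] ints, scales is [N] floats.

def woq_int8_linear(x, w_int8, scales):
    """linear-style form: (x @ w.T) * scales, with w cast from int8 to float."""
    return [
        [sum(xv * float(wv) for xv, wv in zip(x_row, w_row)) * s
         for w_row, s in zip(w_int8, scales)]
        for x_row in x
    ]

def permute_2d(w):
    """Swap the two dimensions of a nested-list matrix ([N][K] -> [K][N])."""
    return [list(col) for col in zip(*w)]

def woq_int8_mm_permute(x, w_int8, scales):
    """mm-style form: x @ permute(w.to(float)), then the per-channel scale."""
    w_t = permute_2d([[float(v) for v in row] for row in w_int8])  # [K][N]
    n = len(scales)
    return [
        [sum(x_row[k] * w_t[k][j] for k in range(len(x_row))) * scales[j]
         for j in range(n)]
        for x_row in x
    ]

x = [[1.0, 2.0], [0.5, -1.0]]
w_int8 = [[3, -4], [5, 6], [-7, 8]]   # values within the int8 range
scales = [0.5, 0.25, 0.125]           # one scale per output channel

out_linear = woq_int8_linear(x, w_int8, scales)
out_permute = woq_int8_mm_permute(x, w_int8, scales)
print(out_linear == out_permute)
```

Both functions compute the same values. A pattern matcher that knows only one spelling will miss the other and fall back to the unfused path.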

**Test Plan**
```
python test/inductor/test_mkldnn_pattern_matcher.py -k test_woq_int8
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135928
Approved by: https://github.com/jgong5, https://github.com/eellison
2024-09-18 00:56:16 +00:00
autoheuristic AutoHeuristic: mm ranking heuristic h100 (#133608) 2024-08-16 16:20:38 +00:00
codegen [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
compile_worker Add basic mypy annotations to inductor (#132416) 2024-08-04 18:43:37 +00:00
fx_passes [AO][Inductor] Enable WOQ fusion pattern with permute (#135928) 2024-09-18 00:56:16 +00:00
kernel [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
package [aoti] Add cpp loader (#135374) 2024-09-11 03:00:01 +00:00
runtime [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
__init__.py [aoti] Add cpp loader (#135374) 2024-09-11 03:00:01 +00:00
aoti_eager.py [4/N] Non-Tensor: Support layout, device and dtype for aten operations (#125897) 2024-07-23 17:50:17 +00:00
async_compile.py [inductor] Enable subprocess parallel compile internally with killswitch (#132467) 2024-09-10 19:05:46 +00:00
autotune_process.py Revert "Add Triton CPU as an Inductor backend (#133408)" 2024-09-16 18:33:33 +00:00
bounds.py [inductor] Move LoopBody to its own file (#135257) 2024-09-07 16:29:15 +00:00
codecache.py Refactor FxGraphCache.load into separate functions, so that AOTAutogradCache may access it correctly later (#135491) 2024-09-16 19:48:08 +00:00
comm_analysis.py [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) 2024-07-20 16:20:58 +00:00
comms.py [Traceable FSDP2] Use .copy_ instead of .set_ for unsharded_param inplace update; Replace unsharded_param graph input usage with graph intermediate; Support FSDP2+LoRA (#133730) 2024-09-11 23:01:05 +00:00
compile_fx.py [AOTI] Refactor how cpp_wrapper specific options are set (#136035) 2024-09-16 14:32:13 +00:00
config.py Back out "Flip triton kernel default layout constraint to "needs_fixed_stride_order" (#135581)" (#136160) 2024-09-17 01:06:10 +00:00
constant_folding.py [AO][Inductor] Enable WOQ fusion pattern with permute (#135928) 2024-09-18 00:56:16 +00:00
cpp_builder.py [Inductor] Generalize is_cuda to specific device_type to make cpp_wrapper mode be extensible (#134693) 2024-09-10 10:11:13 +00:00
cpu_vec_isa.py Revise CPU vectorization ISA support API (#135075) 2024-09-05 12:14:56 +00:00
cudagraph_trees.py [EASY] Typofix (#135022) 2024-09-04 01:59:40 +00:00
cudagraph_utils.py [CUDAGraph] Warn once if too many distinct sizes (#132832) 2024-08-07 19:48:06 +00:00
debug.py [inductor][debug] fix draw_buffers (#135266) 2024-09-06 04:12:41 +00:00
decomposition.py Change wrapped_linear_prepack and wrapped_quantized_linear_prepacked to private by adding _ as prefix (#135401) 2024-09-08 04:16:24 +00:00
dependencies.py [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
exc.py Add basic mypy annotations to inductor (#132416) 2024-08-04 18:43:37 +00:00
extern_node_serializer.py [BE][Easy][14/19] enforce style for empty lines in import segments in torch/_[a-c]*/ and torch/_[e-h]*/ and torch/_[j-z]*/ (#129765) 2024-07-31 10:42:50 +00:00
freezing.py Don't use _disable_current_modes as decorator (#132809) 2024-08-07 23:59:46 +00:00
fx_utils.py Add basic mypy annotations to inductor (#132416) 2024-08-04 18:43:37 +00:00
graph.py [AOTI] Refactor how cpp_wrapper specific options are set (#136035) 2024-09-16 14:32:13 +00:00
hooks.py [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) 2024-07-20 16:20:58 +00:00
index_propagation.py Add try except for _maybe_evaluate_static call in IndexPropagation (#132128) 2024-08-05 01:02:51 +00:00
inductor_prims.py inductor: dont use default_dtype during rng functionalization (#136041) 2024-09-17 03:40:54 +00:00
ir.py [inductor] Cleanup analysis done at lowering time (#135412) 2024-09-08 18:02:36 +00:00
jagged_lowerings.py Revert "[BE] typing for decorators - _inductor/lowering (#131574)" 2024-07-28 03:29:32 +00:00
loop_body.py [inductor] Use TracerBase directly in LoopBody (#135820) 2024-09-13 00:18:41 +00:00
lowering.py [effects] Turn off dtype promotion for with_effects lowering (#136039) 2024-09-16 16:14:05 +00:00
metrics.py [inductor] move loop ordering after fusion (#126254) 2024-08-29 21:50:07 +00:00
mkldnn_ir.py [AOTI] Add C shim for aten.mkldnn_rnn_layer in cpp wrapper (#134857) 2024-09-09 16:54:12 +00:00
mkldnn_lowerings.py Inductor-CPU WoQ int8 GEMM micro-kernel with scale epilogue (#131887) 2024-08-14 03:14:45 +00:00
ops_handler.py [inductor] Cleanup analysis done at lowering time (#135412) 2024-09-08 18:02:36 +00:00
optimize_indexing.py [inductor] Move LoopBody to its own file (#135257) 2024-09-07 16:29:15 +00:00
pattern_matcher.py [inductor] Refactor simplify erase_nodes() (#134822) 2024-09-04 17:32:07 +00:00
quantized_lowerings.py Inductor-CPU WoQ int8 GEMM micro-kernel with scale epilogue (#131887) 2024-08-14 03:14:45 +00:00
remote_cache.py [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
scheduler.py [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
script.ld
select_algorithm.py Revert "Add Triton CPU as an Inductor backend (#133408)" 2024-09-16 18:33:33 +00:00
sizevars.py [BE] Fix MYPY issues (#133872) 2024-08-20 16:12:04 +00:00
subgraph_lowering.py [BE] typing for decorators - fx/_compatibility (part 1) (#134202) 2024-08-22 17:07:33 +00:00
test_case.py [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) 2024-07-20 16:20:58 +00:00
test_operators.py [BE][Easy][16/19] enforce style for empty lines in import segments in torch/_i*/ (#129768) 2024-07-20 16:20:58 +00:00
utils.py Revert "Add Triton CPU as an Inductor backend (#133408)" 2024-09-16 18:33:33 +00:00
virtualized.py [inductor] Move LoopBody to its own file (#135257) 2024-09-07 16:29:15 +00:00
wrapper_benchmark.py remove fast_flush arguments (#135387) 2024-09-13 08:13:46 +00:00