pytorch/torch/utils
Kefei Lu d2d1258b1b Speed up AMD AOT Inductor lowering by memoizing hipify trie to regex logic (#140156)
Summary:
AMD lowering duration is 1.55x longer than on H100. Profiling shows hipification-related functions took 22% of overall lowering time.

This diff cuts that time by safely memoizing the trie-to-regex logic. The trick is to incrementally build a state for the trie as it is constructed. The state is a hash of all the words added to the trie so far, so it can serve as the memoization key.
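
The idea above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Trie` class, `state()`, `trie_to_regex`, and `_PATTERN_CACHE` are hypothetical names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import re


class Trie:
    """Toy trie that keeps a running digest of every word added (hypothetical)."""

    def __init__(self):
        self.root = {}
        self._hash = hashlib.sha256()  # assumed hash; any stable digest would do

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = True  # end-of-word marker
        # Fold the word into the running state; the separator keeps
        # "ab" + "c" distinct from "a" + "bc".
        self._hash.update(word.encode() + b"\0")

    def state(self):
        # Digest of all words added so far, in insertion order.
        return self._hash.hexdigest()


def _to_pattern(node):
    """Recursively turn a trie node into a regex fragment."""
    alts = [re.escape(ch) + _to_pattern(node[ch])
            for ch in sorted(k for k in node if k != "")]
    if not alts:
        return ""
    is_word_end = "" in node
    if len(alts) == 1 and not is_word_end:
        return alts[0]
    return "(?:" + "|".join(alts) + ")" + ("?" if is_word_end else "")


_PATTERN_CACHE = {}  # trie state -> compiled regex


def trie_to_regex(trie):
    """Memoized conversion: the expensive walk runs once per distinct state."""
    key = trie.state()
    if key not in _PATTERN_CACHE:
        _PATTERN_CACHE[key] = re.compile(_to_pattern(trie.root))
    return _PATTERN_CACHE[key]
```

In this sketch the key depends on insertion order, so two tries holding the same words added in a different order simply miss the cache. Barring hash collisions, a stale pattern is never returned, because adding any word changes the state.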

Differential Revision: D65659445

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140156
Approved by: https://github.com/ColinPeppler

Co-authored-by: Kefei Lu <kefeilu@meta.com>
2024-11-09 04:28:58 +00:00
_strobelight Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
_sympy [inductor] sympy.Integer([01]) -> sympy.S.(Zero|One) (#139523) 2024-11-04 04:28:40 +00:00
backcompat
benchmark Fix torch.load (torch.utils.benchmark) after #137602 (#139810) 2024-11-06 03:08:29 +00:00
bottleneck
data Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
hipify Speed up AMD AOT Inductor lowering by memoizing hipify trie to regex logic (#140156) 2024-11-09 04:28:58 +00:00
jit Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
model_dump Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
tensorboard Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
viz Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
__init__.py
_backport_slots.py
_config_module.py Add type annotations to Configs (#139833) 2024-11-07 03:49:09 +00:00
_config_typing.pyi
_content_store.py
_contextlib.py
_cpp_extension_versioner.py Avoid file encoding issues when loading cpp extensions (#138565) 2024-10-28 14:06:34 +00:00
_cxx_pytree.py
_device.py
_exposed_in.py
_foreach_utils.py
_freeze.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
_get_clean_triton.py
_import_utils.py
_mode_utils.py
_ordered_set.py
_python_dispatch.py [BE]: Update Typeguard to TypeIs for better type inference (#133814) 2024-10-26 15:07:13 +00:00
_pytree.py [dynamo] support maxlen for collections.deque (#138194) 2024-10-30 10:08:02 +00:00
_stats.py
_thunk.py
_traceback.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
_triton.py Revert "[user triton] typing triton_kernel_wrap.py (#138230)" 2024-10-18 23:12:29 +00:00
_typing_utils.py
_zip.py
backend_registration.py
bundled_inputs.py
checkpoint.py
collect_env.py Update wmic command used in collect_env.py to its counterpart in powershell due to its deprecation (#138297) 2024-10-18 07:03:17 +00:00
cpp_backtrace.py
cpp_extension.py Enable Windows Arm64 (#133088) 2024-10-24 16:10:44 +00:00
deterministic.py
dlpack.py
file_baton.py
flop_counter.py FlopCounterMode: Decompose ops for inference mode (#138508) 2024-11-09 03:13:53 +00:00
hooks.py
mkldnn.py
mobile_optimizer.py
model_zoo.py
module_tracker.py
show_pickle.py
throughput_benchmark.py
weak.py