pytorch/torch/utils
Edward Z. Yang 326db8af4c Replace sympy Min/Max with reimplementations (#133319)
Sympy's implementation of Min/Max displays asymptotically bad behavior on `TORCH_COMPILE_CPROFILE=1 python torchrec/distributed/tests/test_pt2_multiprocess.py TestPt2Train.test_compile_multiprocess`. Evidence profile:

![image](https://github.com/user-attachments/assets/142301e9-3a18-4370-b9db-19b32ece7ee8)

On this test case, 42% of the total time spent compiling the network goes to ShapeEnv.replace, which in turn spends all of its time in xreplace.

The problem appears to be the find_localzeros call. By vendoring the implementations of Min/Max, we can potentially reduce the cost of this operation.
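
As a rough illustration of how a "local zeros" style filter can become expensive, here is a toy sketch. It is not sympy's code: `find_local_maxima` is a hypothetical helper, ints stand in for values whose order is known, and strings stand in for symbols that cannot be ordered against anything. The assumption being illustrated is that every new argument is checked against every surviving candidate, so arguments that cannot be compared accumulate and the number of pairwise checks grows quadratically.

```python
def find_local_maxima(values):
    """Toy local-maxima filter; NOT sympy's actual implementation.

    Ints model values with a known order; strings model symbols whose
    order relative to anything else is unknown (an assumption made for
    this sketch). Returns (surviving candidates, pairwise checks done).
    """
    local = []
    checks = 0
    for v in values:
        is_new = True
        for i, z in enumerate(local):
            checks += 1
            if v == z:
                is_new = False  # duplicate argument, nothing to add
            elif isinstance(v, int) and isinstance(z, int):
                is_new = False  # order is known, so only one survives
                if v > z:
                    local[i] = v
        if is_new:
            local.append(v)
    return local, checks


# 100 mutually incomparable "symbols": every pair gets checked once.
symbols = [f"s{i}" for i in range(100)]
_, checks = find_local_maxima(symbols)
print(checks)  # 4950 == 100 * 99 // 2
```

Known-order values collapse to a single candidate, while unordered ones all survive, which is why expressions with many symbolic arguments are the costly case in this model.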

The implementation is copy-pasted from sympy/functions/elementary/miscellaneous.py but with some adjustments:

* I deleted logic related to differentiation, evalf and heaviside, as it's not relevant to PyTorch reasoning
* There's some massaging to appease PyTorch's linters, including a lot of noqa and type: ignore (which I could potentially refactor away with substantive changes, but that's better as its own change)
* I deleted the second loop iteration for is_connected, as an attempt at initial optimization (this also simplifies the port, since I can omit some code). I'll comment at that point what the exact difference is.

Before this change, the test in question takes 100s with 40 features; after this change, it takes only 69s.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133319
Approved by: https://github.com/Skylion007
2024-08-25 05:05:59 +00:00
..
_strobelight [BE][Easy][19/19] enforce style for empty lines in import segments in torch/[o-z]*/ (#129771) 2024-08-01 17:07:14 +00:00
_sympy Replace sympy Min/Max with reimplementations (#133319) 2024-08-25 05:05:59 +00:00
backcompat
benchmark [BE] enable UFMT for torch/storage.py (#127706) 2024-06-27 23:16:24 +00:00
bottleneck
data [BE][Easy] enable ruff rule PIE790: unnecessary pass statement (#133200) 2024-08-15 15:50:19 +00:00
hipify [Reland2] Update NVTX to NVTX3 (#109843) 2024-08-20 16:33:26 +00:00
jit
model_dump [BE][Easy] replace import pathlib with from pathlib import Path (#129426) 2024-06-30 01:36:07 +00:00
tensorboard
viz [Memory Snapshot][Viz] Add Allocator Settings Tab (#132518) 2024-08-13 17:35:12 +00:00
__init__.py [BE] enable UFMT in torch.utils.data (#127705) 2024-06-27 23:16:24 +00:00
_backport_slots.py [BE][Easy][19/19] enforce style for empty lines in import segments in torch/[o-z]*/ (#129771) 2024-08-01 17:07:14 +00:00
_config_module.py allow SubConfigProxy of arbitrary depth (#133418) 2024-08-14 18:43:00 +00:00
_config_typing.pyi Use Generic TypeAlias (PEP 585) and Union Type (PEP 604) in .pyi stub files (#129419) 2024-06-29 09:23:39 +00:00
_content_store.py [BE][Easy][19/19] enforce style for empty lines in import segments in torch/[o-z]*/ (#129771) 2024-08-01 17:07:14 +00:00
_contextlib.py
_cpp_extension_versioner.py
_cxx_pytree.py [BE][Easy][19/19] enforce style for empty lines in import segments in torch/[o-z]*/ (#129771) 2024-08-01 17:07:14 +00:00
_device.py Fix DeviceContext bug (#133729) 2024-08-20 07:14:37 +00:00
_exposed_in.py Revert "[BE] typing for decorators - _library/custom_ops (#131578)" 2024-07-28 03:29:32 +00:00
_foreach_utils.py
_freeze.py [BE] mypy: disallow untyped decorators (#131428) 2024-07-23 21:50:55 +00:00
_get_clean_triton.py Fix edge case in inductor triton clean script (#130837) 2024-08-19 23:46:11 +00:00
_import_utils.py
_mode_utils.py
_ordered_set.py [BE] Fix MYPY issues (#133872) 2024-08-20 16:12:04 +00:00
_python_dispatch.py [BE] typing for decorators - fx/_compatibility (part 1) (#134202) 2024-08-22 17:07:33 +00:00
_pytree.py [pytree] Only import optree if it's used (#131478) 2024-07-24 00:10:49 +00:00
_stats.py
_thunk.py Refactor thunkify to return proper thunk abstraction (#132407) 2024-08-06 02:35:45 +00:00
_traceback.py
_triton.py Remove AMD restrictions on triton hashing (#133616) 2024-08-16 08:02:48 +00:00
_typing_utils.py [BE][Easy][19/19] enforce style for empty lines in import segments in torch/[o-z]*/ (#129771) 2024-08-01 17:07:14 +00:00
_zip.py
backend_registration.py
bundled_inputs.py
checkpoint.py [Torch] Support meta device in checkpoint (#132684) 2024-08-06 20:45:50 +00:00
collect_env.py
cpp_backtrace.py
cpp_extension.py Fix Extension attribute name in CppExtension example (#134046) 2024-08-21 13:58:16 +00:00
deterministic.py
dlpack.py
file_baton.py
flop_counter.py [BE][Ez]: FURB142,FURB92 misc preview fixes (#133880) 2024-08-21 13:54:51 +00:00
hooks.py
mkldnn.py
mobile_optimizer.py
model_zoo.py
module_tracker.py [BE][Easy][19/19] enforce style for empty lines in import segments in torch/[o-z]*/ (#129771) 2024-08-01 17:07:14 +00:00
show_pickle.py
throughput_benchmark.py
weak.py