pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Edward Z. Yang	323fb4dad0	Unconditionally exclude upper bound in all size oblivious tests (#144867 ) I was thinking about https://github.com/pytorch/pytorch/pull/144471 some more and I thought, "Hmm, why not just always exclude the constant upper bound." So here it is. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/144867 Approved by: https://github.com/bobrenjc93	2025-01-21 20:44:09 +00:00
Jason Ansel	505ade7471	[inductor] Simplify mode options, only apply CompilerBisector changes once (#145232 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145232 Approved by: https://github.com/yanboliang	2025-01-21 19:25:46 +00:00
Jason Ansel	4eea2f7496	[inductor] Fix ignored options for torch.compile (#145131 ) #139833 broke `torch.compile(options=...)` so that many (all?) options passed in get completely ignored. @alexreinking pointed this out when `options={"cpu_backend":"halide"}` did nothing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145131 Approved by: https://github.com/exclamaforte	2025-01-18 03:39:49 +00:00
PyTorch MergeBot	d21738f24a	Revert "Fix torch.normal ignores default_device (#144070 )" This reverts commit `184549b2d7`. Reverted https://github.com/pytorch/pytorch/pull/144070 on behalf of https://github.com/ezyang due to breaking a specific use case ([comment](https://github.com/pytorch/pytorch/pull/144070#issuecomment-2590681953))	2025-01-14 17:41:58 +00:00
Nikita Shulga	f2975717f3	[CD] Fix slim-wheel nvjit-link import problem (#141063 ) When another toolkit (say CUDA-12.3) is installed and `LD_LIBRARY_PATH` points to it, `import torch` will fail with ``` ImportError: /usr/local/lib/python3.10/dist-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkComplete_12_4, version libnvJitLink.so.12 ``` This could not be worked around by tweaking rpath, as it also depends on the library load order, which is not guaranteed by any linker. Instead, solve this by preloading `nvjitlink` right after global deps are loaded, by running something along the lines of the following ```python if version.cuda in ["12.4", "12.6"]: with open("/proc/self/maps") as f: _maps = f.read() # libtorch_global_deps.so always depends on cudart; check if it's installed via wheel if "nvidia/cuda_runtime/lib/libcudart.so" in _maps: # If all the above conditions are met, preload nvjitlink _preload_cuda_deps("nvjitlink", "libnvJitLink.so.*[0-9]") ``` Fixes https://github.com/pytorch/pytorch/issues/140797 Pull Request resolved: https://github.com/pytorch/pytorch/pull/141063 Approved by: https://github.com/kit1980 Co-authored-by: Sergii Dymchenko <sdym@meta.com>	2025-01-14 17:33:07 +00:00
Edward Z. Yang	ffb3f32693	Add max kwarg to torch._check with alternate size oblivious semantics (#144471 ) Fixes https://github.com/pytorch/pytorch/issues/120288 for the static bound case I had been tying myself in knots in the original issue about the fact that we can't really do symbolic bounds like u0 < s0. But then I realized, "Wait, but the static bounds are easy!" So this makes it so you can also exclude a specific upper bound when doing size oblivious tests, which is enough to solve https://github.com/pytorch/pytorch/issues/123592#issuecomment-2574556708 It's written very dirtily, maybe there's some cleanup. Bikeshed on the public API name also welcome. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/144471 Approved by: https://github.com/avikchaudhuri	2025-01-14 15:10:57 +00:00
zeshengzong	184549b2d7	Fix torch.normal ignores default_device (#144070 ) Fixes #122886 1. Enable `torch.normal` to work with `DeviceContext` so that it picks up the default device set via `set_default_device`. 2. Add a hint to the `set_default_device` doc suggesting the `torch.Tensor.to` method to move tensors to the desired device explicitly. Test Result 1. Doc Preview ![image](https://github.com/user-attachments/assets/eb69c334-be2b-4dc5-bdce-567da21e1635) 2. Local Test ```python >>> import torch >>> torch.normal(0.,1., (10,10)).device device(type='cpu') >>> torch.set_default_device('cuda') >>> torch.normal(0.,1., (10,10)).device device(type='cuda', index=0) ``` ```bash pytest test/test_tensor_creation_ops.py ``` ![image](https://github.com/user-attachments/assets/8b466b55-f162-4b83-8b20-71de2c1d0914) ```bash lintrunner ``` ![image](https://github.com/user-attachments/assets/5b269c50-da57-47ed-8500-4edf2c2295e4) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144070 Approved by: https://github.com/ezyang	2025-01-10 08:19:55 +00:00
atalman	2b241a8206	Amazon Linux 2023: Preload cusparseLt.so (#144477 ) Fixes https://github.com/pytorch/pytorch/issues/144433 Test with some debug statements added: ``` >>> import torch trying to load libcublas.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/cublas/lib/libcublas.so.12'] trying to load libcublas.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/cublas/lib/libcublas.so.12 trying to load libcudnn.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/cudnn/lib/libcudnn.so.9'] trying to load libcudnn.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/cudnn/lib/libcudnn.so.9 trying to load libnvrtc.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12'] trying to load libnvrtc.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12 trying to load libcudart.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/cuda_runtime/lib/libcudart.so.12'] trying to load libcudart.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/cuda_runtime/lib/libcudart.so.12 trying to load libcupti.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/cuda_cupti/lib/libcupti.so.12'] trying to load libcupti.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/cuda_cupti/lib/libcupti.so.12 trying to load libcufft.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/cufft/lib/libcufft.so.11'] trying to load libcufft.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/cufft/lib/libcufft.so.11 trying to load libcurand.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/curand/lib/libcurand.so.10'] trying to load libcurand.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/curand/lib/libcurand.so.10 trying to load libnvJitLink.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12'] trying to load libnvJitLink.so.[0-9] from 
/usr/local/lib/python3.9/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12 trying to load libcusparse.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/cusparse/lib/libcusparse.so.12'] trying to load libcusparse.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/cusparse/lib/libcusparse.so.12 trying to load libcusparseLt.so.[0-9] from [] trying to load libcusparseLt.so.[0-9] from /usr/local/lib/python3.9/site-packages/cusparselt/lib/libcusparseLt.so.0 trying to load libcusolver.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/cusolver/lib/libcusolver.so.11'] trying to load libcusolver.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/cusolver/lib/libcusolver.so.11 trying to load libnccl.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/nccl/lib/libnccl.so.2'] trying to load libnccl.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/nccl/lib/libnccl.so.2 trying to load libnvToolsExt.so.[0-9] from ['/usr/local/lib/python3.9/site-packages/nvidia/nvtx/lib/libnvToolsExt.so.1'] trying to load libnvToolsExt.so.[0-9] from /usr/local/lib/python3.9/site-packages/nvidia/nvtx/lib/libnvToolsExt.so.1 /usr/local/lib64/python3.9/site-packages/torch/_subclasses/functional_tensor.py:275: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:81.) cpu = _conversion_method_template(device=torch.device("cpu")) >>> exit() ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/144477 Approved by: https://github.com/Skylion007, https://github.com/nWEIdia	2025-01-09 20:04:11 +00:00
William Wen	f700035090	[3.13t] use sysconfig to check for Python nogil builds (#144361 ) `sys._is_gil_enabled()` wasn't working in certain cases, according to @atalman Pull Request resolved: https://github.com/pytorch/pytorch/pull/144361 Approved by: https://github.com/atalman	2025-01-08 13:00:32 +00:00
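The distinction this commit draws (a build-time check via `sysconfig` versus the runtime `sys._is_gil_enabled()`) can be illustrated with a minimal sketch. The helper names below are hypothetical and this is not the code from the PR; it assumes only that free-threaded CPython builds expose the `Py_GIL_DISABLED` config variable.

```python
import sys
import sysconfig


def is_free_threaded_build() -> bool:
    """Build-time check: was this interpreter compiled without the GIL?

    Py_GIL_DISABLED is 1 on free-threaded (3.13t) builds and 0 or None
    on regular builds. The helper name is hypothetical.
    """
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_currently_enabled() -> bool:
    """Runtime check: is the GIL active right now?

    sys._is_gil_enabled() only exists on Python 3.13+; on older
    interpreters we assume the GIL is always enabled.
    """
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


if __name__ == "__main__":
    print("free-threaded build:", is_free_threaded_build())
    print("GIL enabled at runtime:", gil_currently_enabled())
```

The two answers can differ: a free-threaded build may still run with the GIL re-enabled, which is one reason a build-time check can be the more stable signal.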
Oguz Ulgen	dc55704b48	Rename cache limit to recompile limit in configs (#143709 ) This PR renames every cache_limit to recompile_limit via sed. Old config options are maintained via Config(alias='xyz') Pull Request resolved: https://github.com/pytorch/pytorch/pull/143709 Approved by: https://github.com/jansel	2024-12-22 10:03:57 +00:00
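How an old option name can keep working after a rename is sketched below with a toy class. This is an assumption-laden illustration of the alias idea only: `AliasedConfig`, its constructor, and the value `8` are hypothetical and do not reproduce PyTorch's `Config(alias=...)` machinery.

```python
class AliasedConfig:
    """Toy config object where deprecated names forward to new names.

    Hypothetical sketch; not PyTorch's implementation.
    """

    def __init__(self, values, aliases):
        # Bypass our own __setattr__ while setting up internal state.
        object.__setattr__(self, "_values", dict(values))
        object.__setattr__(self, "_aliases", dict(aliases))

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        key = self._aliases.get(name, name)
        try:
            return self._values[key]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        key = self._aliases.get(name, name)
        if key not in self._values:
            raise AttributeError(name)
        self._values[key] = value


# The old name "cache_limit" is kept as an alias of "recompile_limit".
config = AliasedConfig({"recompile_limit": 8}, {"cache_limit": "recompile_limit"})
config.cache_limit = 16  # a write through the old name...
print(config.recompile_limit)  # ...is visible through the new name
```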
William Wen	e1e83015d2	[dynamo, 3.13t] raise error if torch.compile is attempted in 3.13t (nogil) (#143404 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143404 Approved by: https://github.com/colesbury, https://github.com/atalman	2024-12-19 18:10:01 +00:00
Yukio Siraichi	f8c212a925	Transform unbacked int expressions into a fresh unbacked int. (#141917 ) Fix: #141419 This PR introduces the `torch.sym_fresh_size` API, which transforms an unbacked int expression into a fresh unbacked int. Pull Request resolved: https://github.com/pytorch/pytorch/pull/141917 Approved by: https://github.com/ezyang	2024-12-05 16:53:44 +00:00
William Wen	416f500bfe	[CI, 3.13] enable 3.13 CI (#139533 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/139533 Approved by: https://github.com/atalman, https://github.com/malfet ghstack dependencies: #141409, #142003, #141572, #141577, #141605, #141621, #141623, #141673, #141674, #141858, #141862	2024-12-05 00:25:03 +00:00
William Wen	ee7eaad5c3	[dynamo] add SymNode bitwise and/or (#138777 ) Fixes [T203472723](https://www.internalfb.com/intern/tasks/?t=203472723) Pull Request resolved: https://github.com/pytorch/pytorch/pull/138777 Approved by: https://github.com/ezyang	2024-11-22 23:36:16 +00:00
PyTorch MergeBot	2239d1a7a3	Revert "[CI, 3.13] enable 3.13 CI (#139533 )" This reverts commit `b7a25c1ee7`. Reverted https://github.com/pytorch/pytorch/pull/139533 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing test_cpp_extensions_open_device_registration. The test was wrongly excluded by TD ([comment](https://github.com/pytorch/pytorch/pull/139533#issuecomment-2494328806))	2024-11-22 17:18:49 +00:00
William Wen	b7a25c1ee7	[CI, 3.13] enable 3.13 CI (#139533 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/139533 Approved by: https://github.com/atalman, https://github.com/malfet	2024-11-22 14:43:02 +00:00
PyTorch MergeBot	c1fe6be202	Revert "[dynamo] add SymNode bitwise and/or (#138777 )" This reverts commit `c98ef0279e`. Reverted https://github.com/pytorch/pytorch/pull/138777 on behalf of https://github.com/ezyang due to triggering AssertionError: Guard check failed: 14/2: name 'BitwiseFn_bitwise_or' is not defined ([comment](https://github.com/pytorch/pytorch/pull/138777#issuecomment-2477477776))	2024-11-14 21:52:40 +00:00
William Wen	c98ef0279e	[dynamo] add SymNode bitwise and/or (#138777 ) Fixes [T203472723](https://www.internalfb.com/intern/tasks/?t=203472723) Pull Request resolved: https://github.com/pytorch/pytorch/pull/138777 Approved by: https://github.com/ezyang	2024-11-13 18:31:06 +00:00
iremyux	dd79d2f5e7	Removing warning for Windows Arm64 (#139746 ) This PR removes the warning message on Windows on Arm64, which was triggered by an issue in one of the DLLs, to improve the user experience. `Microsoft Visual C++ Redistributable is not installed, this may lead to the DLL load failure. It can be downloaded at https://aka.ms/vs/16/release/vc_redist.x64.exe` The issue is being tracked here: https://developercommunity.visualstudio.com/t/VCRUNTIME140_1DLL-Miscompiled-for-Arm64/10781635? Pull Request resolved: https://github.com/pytorch/pytorch/pull/139746 Approved by: https://github.com/malfet Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>	2024-11-08 16:23:59 +00:00
Gabriel Ferns	2037ea3e15	Add type annotations to Configs (#139833 ) Summary: Adds types to Configs, and fixes a bug in options that was caused by the lack of types. fixes: https://github.com/pytorch/pytorch/issues/139822 Configs are used by many modules so not sure which label to put. Types also allow https://github.com/pytorch/pytorch/pull/139736 to fuzz configs Pull Request resolved: https://github.com/pytorch/pytorch/pull/139833 Approved by: https://github.com/c00w	2024-11-07 03:49:09 +00:00
Bob Ren	fdd298dcb7	add hex method on SymFloat (#139451 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/139451 Approved by: https://github.com/ezyang	2024-11-02 05:33:19 +00:00
eellison	ee2f8a50d3	Class rename (#139490 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/139490 Approved by: https://github.com/exclamaforte, https://github.com/zou3519 ghstack dependencies: #139295	2024-11-02 00:10:17 +00:00
Bob Ren	74b7fb9519	Add conjugate method on SymFloat (#139249 ) Fixes python test/dynamo/test_dynamic_shapes.py DynamicShapesFunctionTests.test_number_method_method_conjugate_num_type4_dynamic_shapes when we turn off specialize float on eager: https://github.com/pytorch/pytorch/pull/138915 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139249 Approved by: https://github.com/ezyang	2024-10-31 04:55:36 +00:00
PyTorch MergeBot	42d790bb65	Revert "Add conjugate method on SymFloat (#139249 )" This reverts commit `bcf8a0124f`. Reverted https://github.com/pytorch/pytorch/pull/139249 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but the doc build failure is legit ([comment](https://github.com/pytorch/pytorch/pull/139249#issuecomment-2448755839))	2024-10-31 00:45:48 +00:00
Bob Ren	bcf8a0124f	Add conjugate method on SymFloat (#139249 ) Fixes python test/dynamo/test_dynamic_shapes.py DynamicShapesFunctionTests.test_number_method_method_conjugate_num_type4_dynamic_shapes when we turn off specialize float on eager: https://github.com/pytorch/pytorch/pull/138915 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139249 Approved by: https://github.com/ezyang	2024-10-30 23:28:09 +00:00
PyTorch MergeBot	49bfbed2eb	Revert "Add deterministic path for CUDA `cumsum` (#136224 )" This reverts commit `383eba5229`. Reverted https://github.com/pytorch/pytorch/pull/136224 on behalf of https://github.com/ezyang due to larger memory usage apparently not acceptable ([comment](https://github.com/pytorch/pytorch/pull/136224#issuecomment-2447382819))	2024-10-30 14:43:15 +00:00
Nikita Shulga	bd369bb182	Workaround torch.deploy failures (#139195 ) Summary: torch.deploy runtimes are packaged with an older version of `typing_extensions`, but this runtime could not care less about type-checking. So pretend that it has `TypeIs` by replacing it with `TypeGuard`. Fixes test failures introduced by https://github.com/pytorch/pytorch/pull/133814 / D65030974 Test Plan: `buck2 test 'fbcode//mode/opt' fbcode//multipy/runtime:test_deploy -- --exact 'multipy/runtime:test_deploy - TorchpyTest.TestNumpy'` Differential Revision: D65145409 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139195 Approved by: https://github.com/Skylion007	2024-10-29 23:36:16 +00:00
Edward Z. Yang	91ded0576d	Add sym_log2 (#137980 ) Internal xref: https://fb.workplace.com/groups/1075192433118967/permalink/1515595595745313/ Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/137980 Approved by: https://github.com/bobrenjc93	2024-10-28 17:03:14 +00:00
PyTorch MergeBot	2487a834a4	Revert "Add sym_log2 (#137980 )" This reverts commit `5d450d7fac`. Reverted https://github.com/pytorch/pytorch/pull/137980 on behalf of https://github.com/jeanschmidt due to lint broke from this onwards on main ([comment](https://github.com/pytorch/pytorch/pull/137980#issuecomment-2441570186))	2024-10-28 13:21:08 +00:00
Edward Z. Yang	5d450d7fac	Add sym_log2 (#137980 ) Internal xref: https://fb.workplace.com/groups/1075192433118967/permalink/1515595595745313/ Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/137980 Approved by: https://github.com/bobrenjc93	2024-10-28 03:09:11 +00:00
Yu, Guangye	40c098f731	Introduce a device-agnostic runtime API design (#132204 ) # Motivation According to [[RFC]A device-agnostic Python runtime API design for stream-based accelerators](https://github.com/pytorch/pytorch/issues/128403), this PR intends to introduce a device-agnostic runtime API design. I personally prefer the Simple Version APIs that no longer accept the device type as an input argument. It means we will leverage `getAccelerator` to fetch the current accelerator. And it is flexible to expand these APIs to handle multiple types of accelerator scenarios. The design does NOT break the previous design philosophies. I also believe that namespace torch.accelerator is better. It lets users know that the APIs they are calling are running on an accelerator rather than CPU. This is important. Meanwhile, we can follow a simple API design principle: 1. Device-agnostic APIs should be placed under the torch.accelerator namespace and not accept a device_type optional parameter. 2. Device-specific APIs should be placed under device-specific submodules. 3. APIs required by both CPU and accelerators should be placed under the torch namespace and accept a device_type optional parameter. Also, I list the pros and cons of Simple Version here: Pros: - `torch.accelerator.foo` will have the same input argument as `torch.xxx.foo`, bringing a better user experience; - more concise, making it easier for developers to write device-agnostic code. Cons: - no obvious drawbacks. 
# Additional Context I list the new APIs here: ```python torch.accelerator.is_available() -> bool: torch.accelerator.current_accelerator() -> torch.device: torch.accelerator.device_count() -> int: torch.accelerator.current_device_idx() -> int: torch.accelerator.set_device_idx(device: Union[torch.device, str, int, None]) -> None: torch.accelerator.current_stream(device: Union[torch.device, str, int, None]) -> torch.Stream: torch.accelerator.set_stream(stream: torch.Stream) -> None: torch.accelerator.synchronize(device: Union[torch.device, str, int, None]) -> None: ``` Following the discussion with Alban, we decided to rename `set_device` to `set_device_idx` and `current_device` to `current_device_idx` to be more explicit. A separate PR will add device and stream context managers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/132204 Approved by: https://github.com/EikanWang, https://github.com/abhilash1910, https://github.com/gujinghui, https://github.com/albanD	2024-10-27 10:37:09 +00:00
Aaron Gokaslan	49ed365b22	[BE]: Update Typeguard to TypeIs for better type inference (#133814 ) Uses TypeIs instead of TypeGuard for better inference. See https://peps.python.org/pep-0742/ Pull Request resolved: https://github.com/pytorch/pytorch/pull/133814 Approved by: https://github.com/ezyang	2024-10-26 15:07:13 +00:00
Laith Sakka	ed313a5ca2	Introduce torch.sym_add, variadic add (#138660 ) Tested internally here: https://www.internalfb.com/diff/D64057744 This is a reland after previous internal failures. main change is ``` if min is None and max is None: torch._check_is_size(size) return ``` Partially addresses https://github.com/pytorch/pytorch/issues/128150 When you have big sums of values, we end up computing long chains of binary addition in our FX graph representation. Not only is this ugly, it also is quadratic, as the sympy.Add constructor is O(N) in number of arguments. Instead, ensure that we maintain the summation as a single FX node so we can do the entire addition all in one go. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/138660 Approved by: https://github.com/ezyang, https://github.com/bobrenjc93	2024-10-23 17:42:41 +00:00
Colin L. Rice	bb8bc7d6b3	config: simplify most of the config handling and fix some bugs (#138377 ) This PR combines a number of cleanups in one PR. If any of the specific cleanups don't seem to make sense, let me know and I can remove them. Cleanups - This PR adds a set of test suites for the config module code, which handles basically all the APIs and ways it is used. Please let me know if you see anything critical that is not tested that I missed. This test suite is primarily used as the regression test suite for later changes in this diff. Note that there is some dynamo specific testing of the config module, but it isn't as verbose. - I removed all internal usage of shallow_copy_dict. Those usages could all use the deep copy, and did not depend on the reference behavior of certain config values that shallow_copy_dict allows. - I removed shallow copy semantics for configuration with a deprecation warning. I think this requires a release note, so hopefully I did that correctly. Let me know if we want to continue to expose shallow copy value semantics, but I just can't find a case where I expect anyone would want it. It also complicated later internal changes to the API (i.e. breaking apart various layers of the config changes). - I fixed what I believe is a bug in how hashes are calculated on configs. In particular, if you got the hash, then made a config change, and then got the hash again, it would not update the hash. @oulgen, please let me know if I'm misunderstanding this behavior and it is desired. - I switched our multiple implementations of iterating through the dictionary to a single one. This is primarily to make later changes easier, but it also makes it clear how inconsistent our various config ignoring options are. Let me know if people would be interested in me unifying the various options for ignoring config values. 
- I updated the test patcher (not the performance critical one, just the normal one), to use __setattr__ and __getattr__ to remove direct API access to the underlying config fetcher. For release notes, Not sure exactly how to communicate this, but something like "ConfigModule.to_dict, and ConfigModule.shallow_copy_dict no longer retain their shallow copy semantics, which allowed reference values objects to be modified. If you wish to modify the config object, call load_config explicitly". Pull Request resolved: https://github.com/pytorch/pytorch/pull/138377 Approved by: https://github.com/ezyang, https://github.com/jansel, https://github.com/jovianjaison	2024-10-22 13:40:26 +00:00
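The hash bug described in this commit (a hash computed once and not refreshed after a config change) comes down to cache invalidation. The toy class below is a minimal sketch of the corrected behavior; `HashedConfig`, `set`, and `get_hash` are hypothetical names and do not mirror the real config module.

```python
import hashlib


class HashedConfig:
    """Toy config with a cached hash that is invalidated on every write.

    Hypothetical sketch; not the PyTorch config module.
    """

    def __init__(self, **values):
        self._values = dict(values)
        self._cached_hash = None

    def set(self, name, value):
        self._values[name] = value
        # Without this line the stale hash would be returned after a change,
        # which is the kind of bug the commit message describes.
        self._cached_hash = None

    def get_hash(self):
        if self._cached_hash is None:
            payload = repr(sorted(self._values.items())).encode()
            self._cached_hash = hashlib.sha256(payload).hexdigest()
        return self._cached_hash


cfg = HashedConfig(debug=False)
before = cfg.get_hash()
cfg.set("debug", True)
print(before != cfg.get_hash())  # the hash reflects the change
```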
Edward Z. Yang	1b61313acd	Add type stub for SymInt.rsub (#138543 ) Fixes https://github.com/pytorch/pytorch/issues/138478 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/138543 Approved by: https://github.com/malfet	2024-10-22 13:27:32 +00:00
Sergii Dymchenko	012ff2a0aa	Don't try to load cufile (#138501 ) Trying to load it caused a big issue with the 2.5.0 release - https://github.com/pytorch/pytorch/issues/138324 cufile is not actually used by default currently, see #133489 Pull Request resolved: https://github.com/pytorch/pytorch/pull/138501 Approved by: https://github.com/atalman, https://github.com/mikaylagawarecki, https://github.com/malfet	2024-10-22 01:13:27 +00:00
PyTorch MergeBot	32d4582e02	Revert "[BE]: Update Typeguard to TypeIs for better type inference (#133814 )" This reverts commit `16caa8c1b3`. Reverted https://github.com/pytorch/pytorch/pull/133814 on behalf of https://github.com/jeanschmidt due to checking if this will solve inductor errors ([comment](https://github.com/pytorch/pytorch/pull/133814#issuecomment-2427565425))	2024-10-21 19:40:58 +00:00
Xuehai Pan	abbd71d29d	[BE][Easy] enable PYFMT for `torch.fx` (#138443 ) Reproduce command: ```bash ghstack checkout https://github.com/pytorch/pytorch/pull/138443 git checkout HEAD~1 torch/ lintrunner -a --take "PYFMT" --all-files ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/138443 Approved by: https://github.com/ezyang	2024-10-21 19:15:49 +00:00
Aaron Gokaslan	16caa8c1b3	[BE]: Update Typeguard to TypeIs for better type inference (#133814 ) Uses TypeIs instead of TypeGuard for better inference. See https://peps.python.org/pep-0742/ Pull Request resolved: https://github.com/pytorch/pytorch/pull/133814 Approved by: https://github.com/ezyang	2024-10-21 17:20:06 +00:00
PyTorch MergeBot	4557f6e339	Revert "[Dynamo] Disable torch function compilation during guard execution and in compiled bytecode (#137669 )" This reverts commit `bf0b670598`. Reverted https://github.com/pytorch/pytorch/pull/137669 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing test_public_bindings in trunk, maybe a landrace ([comment](https://github.com/pytorch/pytorch/pull/137669#issuecomment-2415331274))	2024-10-15 23:22:58 +00:00
Michael Lazos	bf0b670598	[Dynamo] Disable torch function compilation during guard execution and in compiled bytecode (#137669 ) Fixes https://github.com/pytorch/pytorch/issues/114369 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137669 Approved by: https://github.com/anijain2305	2024-10-15 20:52:58 +00:00
eellison	8543000c27	Search through config changes in compiler bisector (#137346 ) Follow up to https://github.com/pytorch/pytorch/pull/131936. In the original bisector you had to test inline whether a component was being disabled - `if BisectionManager.disable_subsystem("inductor", "post_grad_passes", debug_info)`. This adds a convenient way of testing config changes for root-causing issues. I've added `emulate_precision_casts` and aot_eager_decomp_partition cse as initial ones. Pull Request resolved: https://github.com/pytorch/pytorch/pull/137346 Approved by: https://github.com/zou3519	2024-10-11 20:24:54 +00:00
Kurt Mohler	383eba5229	Add deterministic path for CUDA `cumsum` (#136224 ) Change `cumsum` to call its decomposition when `use_deterministic_algorithms(True)` and input is CUDA. Fixes #89492 Fixes #75240 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136224 Approved by: https://github.com/ezyang, https://github.com/justinchuby, https://github.com/eqy	2024-10-10 06:59:08 +00:00
eellison	47af7cc962	Add compiler bisector (#131936 ) This is a utility to aid the torch.compile debugging. You provide a function that returns True on success, False on failure, or do something out of process and run bisect_helper `good \| bad`. The bisector will first go through backends - `eager`, `aot_eager`, `aot_eager_decomp_partition`, `inductor` to find the first failing backend. Then, it will go through subsystems within the backend - currently limited but could be expanded - and try to find the first subsystem for which disabling fixes the problem. Once it has found the failing subsystem, it will find the number of times the subsystem is applied, and then bisect through it. An example usage of how to hook it up for aot_eager_decomp_partition and decomposition subsystem is : ``` from torch._inductor.bisect_helper import BisectionManager if op in CURRENT_DECOMPOSITION_TABLE: if BisectionManager.disable_subsystem("aot_eager_decomp_partition", "decomposition", lambda: repr(op)): return NotImplemented ``` Once it has discovered the problematic change, it will print out the associated debug info, and you can set the same limits with `TORCH_BISECT_BACKEND` `TORCH_BISECT_SUBSYSTEM` and `TORCH_BISECT_MAX`. We could add further options as an automated way of going through a check list for checking divergence - e.g., the mode to emulate amp casts. Fix for https://github.com/pytorch/pytorch/issues/126546 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131936 Approved by: https://github.com/ezyang	2024-10-09 20:34:11 +00:00
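The last step described in this commit, bisecting over the number of times a subsystem is applied, is an ordinary binary search. The sketch below shows only that step under stated assumptions: the function name and the `passes_with_limit` callback are hypothetical, the run is assumed to pass with zero applications and fail with all of them, and failure is assumed to be monotone in the limit. It does not use the real `BisectionManager`.

```python
from typing import Callable


def find_first_bad_application(total: int, passes_with_limit: Callable[[int], bool]) -> int:
    """Return the smallest n such that enabling the first n applications fails.

    Assumes passes_with_limit(0) is True, passes_with_limit(total) is False,
    and that once a limit fails, every larger limit fails as well.
    """
    good, bad = 0, total  # invariant: `good` passes, `bad` fails
    while bad - good > 1:
        mid = (good + bad) // 2
        if passes_with_limit(mid):
            good = mid
        else:
            bad = mid
    return bad


# Hypothetical scenario: the 37th application of a pass introduces the bug.
culprit = find_first_bad_application(100, lambda limit: limit < 37)
print(culprit)  # 37
```

With 100 applications this needs about seven test runs instead of up to 100, which is why the commit bisects rather than scanning linearly.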
PyTorch MergeBot	16a2c2cfd4	Revert "Introduce torch.sym_sum (#136429 )" This reverts commit `90bed32b98`. Reverted https://github.com/pytorch/pytorch/pull/136429 on behalf of https://github.com/ezyang due to fails internal stuff ([comment](https://github.com/pytorch/pytorch/pull/136429#issuecomment-2403335147))	2024-10-09 20:08:01 +00:00
Edward Z. Yang	90bed32b98	Introduce torch.sym_sum (#136429 ) Partially addresses https://github.com/pytorch/pytorch/issues/128150 When you have big sums of values, we end up computing long chains of binary addition in our FX graph representation. Not only is this ugly, it also is quadratic, as the sympy.Add constructor is O(N) in number of arguments. Instead, ensure that we maintain the summation as a single FX node so we can do the entire addition all in one go. update_hint_regression benchmark, before and after: ``` update_hint_regression,compile_time_instruction_count,2648328980 update_hint_regression,compile_time_instruction_count,2563748678 ``` Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/136429 Approved by: https://github.com/isuruf	2024-10-08 18:12:57 +00:00
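The quadratic-cost argument in this commit can be checked with a small piece of arithmetic. The sketch below uses an assumed cost model taken from the commit's claim (constructing an addition over k arguments costs k units); the function names are hypothetical and nothing here calls sympy or torch.

```python
def cost_of_binary_chain(n_terms: int) -> int:
    """Cost of building ((a0 + a1) + a2) + ... one binary add at a time.

    Assumed model: the k-th intermediate result is a flattened sum over k
    arguments, and constructing it costs k units.
    """
    return sum(k for k in range(2, n_terms + 1))


def cost_of_single_variadic_add(n_terms: int) -> int:
    """Cost of building one sum over all n_terms arguments at once."""
    return n_terms


for n in (4, 1000):
    print(n, cost_of_binary_chain(n), cost_of_single_variadic_add(n))
# 4 9 4
# 1000 500499 1000
```

The chain grows as roughly n²/2 while the single variadic node grows linearly, which matches the motivation for keeping the summation as one FX node.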
Edward Z. Yang	6bd9d37266	Remove allow-untyped-defs from torch.fx.experimental.symbolic_shapes (#137019 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/137019 Approved by: https://github.com/Skylion007 ghstack dependencies: #136934, #136935, #136972	2024-10-01 13:22:10 +00:00
PyTorch MergeBot	e9d2765ec8	Revert "Add deterministic path for CUDA `cumsum` (#136224 )" This reverts commit `d1bb8e828f`. Reverted https://github.com/pytorch/pytorch/pull/136224 on behalf of https://github.com/atalman due to Break internal CI ([comment](https://github.com/pytorch/pytorch/pull/136224#issuecomment-2379214226))	2024-09-27 12:54:47 +00:00
Edward Z. Yang	beb46de342	Correctly convert Python float to float64 when passing argument as Tensor (#136413 ) I can't actually test the Dynamo codegen fix as it is impossible to directly use the Tensor at the moment. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/136413 Approved by: https://github.com/bobrenjc93 ghstack dependencies: #136599	2024-09-26 16:50:13 +00:00
Kurt Mohler	d1bb8e828f	Add deterministic path for CUDA `cumsum` (#136224 ) Change `cumsum` to call its decomposition when `use_deterministic_algorithms(True)` and input is CUDA. Fixes #89492 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136224 Approved by: https://github.com/ezyang, https://github.com/justinchuby	2024-09-26 04:52:05 +00:00


618 Commits