pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Bin Bao	365ac49840	[AOTI] Add an option to specify custom op C shim (#153851 ) Summary: Add an option to tell AOTInductor codegen to generate C shim functions for certain custom ops instead of relying on ProxyExecutor. The lib that defines the custom ops needs to implement the corresponding C shim functions. Differential Revision: [D75014177](https://our.internmc.facebook.com/intern/diff/D75014177) Pull Request resolved: https://github.com/pytorch/pytorch/pull/153851 Approved by: https://github.com/hl475	2025-05-20 05:12:09 +00:00
Bin Bao	a2d0ef242d	[AOTI] Embed cubin files into .so (#150739 ) Summary: Embed cubin files so AOTI is one step closer to generating a single binary. Controlled by a flag and off by default. Differential Revision: [D72535357](https://our.internmc.facebook.com/intern/diff/D72535357) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150739 Approved by: https://github.com/angelayi	2025-05-19 01:11:46 +00:00
Benjamin Glass	cda572b053	codecache: Remove cpp_prefix.h duplication per build, then precompile it (#144293 ) Prior to this PR, `_inductor/codegen/cpp_prefix.h` was copied into a new temporary directory on every inductor run utilizing the CPP backend (i.e. CPU-only), then included in the output source code. Instead, this PR puts it in an appropriate place in the torch includes, and includes it from there. This allows us to precompile it in cpp_wrapper and AOT inductor mode, saving significant compilation time. Due to difficulties getting this to work in FBCode, the precompilation itself is only enabled in OSS PyTorch. Differential Revision: [D69420620](https://our.internmc.facebook.com/intern/diff/D69420620) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144293 Approved by: https://github.com/desertfire	2025-05-16 17:41:36 +00:00
Aaron Gokaslan	1c659b5bc0	[BE]: Use more portable shutil.which call for cpp_builder (#153325 ) We should be using shutil.which instead of calling some binary subprocess here for portability and security. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153325 Approved by: https://github.com/xuhancn, https://github.com/cyyever, https://github.com/albanD	2025-05-12 15:15:21 +00:00
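The `shutil.which` approach described in the commit above can be sketched as follows. This is a minimal illustration, not the code in `cpp_builder.py`: the helper name `find_compiler` and the candidate list are hypothetical, chosen only to show how a PATH lookup works without spawning a subprocess.

```python
import shutil
from typing import Optional, Sequence


def find_compiler(candidates: Sequence[str] = ("g++", "clang++", "cl")) -> Optional[str]:
    """Return the full path of the first candidate found on PATH, or None.

    Hypothetical helper for illustration only; the candidate names are
    assumptions, not the list PyTorch uses.
    """
    for name in candidates:
        # shutil.which searches PATH in-process, so no external `which`
        # or `where` binary has to be executed.
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    print(find_compiler())
```

Because the lookup happens inside the Python process, it behaves the same on Windows and POSIX systems and avoids running a binary whose identity depends on the caller's environment.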
Benjamin Glass	b80bb87689	cpp_wrapper: Miscellaneous fixups (#150143 ) 1. Revisit preprocessing code in cpp_builder.py, removing a hack that channels it through stdout. 2. Fix ops that return None. Differential Revision: [D72053414](https://our.internmc.facebook.com/intern/diff/D72053414) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150143 Approved by: https://github.com/desertfire	2025-04-10 03:31:12 +00:00
Jason Ansel	37ebb0b56a	[inductor] Fix inductor windows linker error (#150256 ) Fixes #149889 Pull Request resolved: https://github.com/pytorch/pytorch/pull/150256 Approved by: https://github.com/anijain2305, https://github.com/eellison	2025-04-01 18:30:55 +00:00
Vlad K	f1b74037b1	Fix bug when Inductor include path contains spaces (#148271 ) This PR fixes a bug with how include directories with spaces are handled on Windows. I ran into an edge case with torch.compile() - it will error out with an exception on Windows. In particular, it will try to execute the following: `cl /I C:/Program Files/Python311/Include ...`, where `C:/Program` will be treated as separate from `Files/Python311/Include`. I looked into using something like `shlex.quote` or `pathlib.Path`, but I didn't find those options to be suitable (shlex is POSIX shell only, pathlib.Path does not escape spaces). There is another place in the function that also deals with escaping spaces. My fix follows the same style. `0ff2e6a85a/torch/_inductor/cpp_builder.py (L1464)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/148271 Approved by: https://github.com/Skylion007 Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>	2025-03-31 06:46:05 +00:00
Xu Han	bc1b8730a4	[Windows][inductor] fix blank space break windows file path (#149388 ) Fixes #149310 From the original error message: ```cmd Command: cl /I C:/Program Files/Python310/Include /I c:/code/.env/lib/site-packages/torch/include /I c:/code/.env/lib/site-packages/torch/include/torch/csrc/api/include /I c:/code/.env/lib/site-packages/torch/include/TH /I c:/code/.env/lib/site-packages/torch/include/THC /D TORCH_INDUCTOR_CPP_WRAPPER /D STANDALONE_TORCH_HEADER /D C10_USING_CUSTOM_GENERATED_MACROS /DLL /MD /O2 /std:c++20 /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc /openmp /openmp:experimental C:/Users/user/AppData/Local/Temp/torchinductor_user/ou/coubnfnqsm2gbdzdytufv46jotd6sxsnnhgldiw45pl5yjq5nbvz.cpp /LD /FeC:/Users/user/AppData/Local/Temp/torchinductor_user/ou/coubnfnqsm2gbdzdytufv46jotd6sxsnnhgldiw45pl5yjq5nbvz.pyd /link /LIBPATH:c:/code/.env/Scripts/libs /LIBPATH:c:/code/.env/lib/site-packages/torch/lib torch.lib torch_cpu.lib torch_python.lib sleef.lib Output: Microsoft (R) C/C++ Optimizing Compiler Version 19.43.34809 for x86 Copyright (C) Microsoft Corporation. All rights reserved. cl : Command line warning D9025 : overriding '/openmp' with '/openmp:experimental' cl : Command line warning D9024 : unrecognized source file type 'Files/Python310/Include', object file assumed coubnfnqsm2gbdzdytufv46jotd6sxsnnhgldiw45pl5yjq5nbvz.cpp C:/Users/user/AppData/Local/Temp/torchinductor_user/ou/coubnfnqsm2gbdzdytufv46jotd6sxsnnhgldiw45pl5yjq5nbvz.cpp(21): fatal error C1083: Cannot open include file: 'Python.h': No such file or directory ``` Python is installed under `C:/Program Files/Python310`, and the blank space breaks the file path.
Solution: Quote Windows file paths. After the fix, the command becomes: ```cmd cl /I "C:/Users/Xuhan/.conda/envs/new_build/Include" /I "C:/Users/Xuhan/.conda/envs/new_build/lib/site-packages/torch/include" /I "C:/Users/Xuhan/.conda/envs/new_build/lib/site-packages/torch/include/torch/csrc/api/include" /D TORCH_INDUCTOR_CPP_WRAPPER /D STANDALONE_TORCH_HEADER /D C10_USING_CUSTOM_GENERATED_MACROS /D CPU_CAPABILITY_AVX512 /DLL /MD /O2 /std:c++20 /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc /openmp /openmp:experimental C:/Users/Xuhan/AppData/Local/Temp/tmp1wsj0m8r/za/czarp3ly5c22ge3hydvnzvad4cjimyr3hkwvofodxqffgil7frfd.cpp /arch:AVX512 /FeC:/Users/Xuhan/AppData/Local/Temp/tmp1wsj0m8r/za/czarp3ly5c22ge3hydvnzvad4cjimyr3hkwvofodxqffgil7frfd.pyd /LD /link /LIBPATH:"C:/Users/Xuhan/.conda/envs/new_build/libs" /LIBPATH:"C:/Users/Xuhan/.conda/envs/new_build/lib/site-packages/torch/lib" "torch.lib" "torch_cpu.lib" "torch_python.lib" "sleef.lib" ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/149388 Approved by: https://github.com/jansel	2025-03-20 03:10:30 +00:00
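The quoting fix described in the two Windows commits above can be sketched in a few lines. This is an illustration under stated assumptions, not the code merged into `cpp_builder.py`: the helpers `quote_path`, `include_flags`, and `libpath_flags` are hypothetical names, and the whitespace split only approximates how an unquoted command line gets tokenized.

```python
def quote_path(path: str) -> str:
    """Wrap a path in double quotes so an embedded space stays in one argument.

    Hypothetical helper; it assumes the path itself contains no double quotes.
    """
    return f'"{path}"'


def include_flags(include_dirs):
    """Build the `/I "<dir>"` portion of an MSVC command line."""
    return " ".join(f"/I {quote_path(d)}" for d in include_dirs)


def libpath_flags(lib_dirs):
    """Build the `/LIBPATH:"<dir>"` portion of the linker options."""
    return " ".join(f"/LIBPATH:{quote_path(d)}" for d in lib_dirs)


if __name__ == "__main__":
    python_include = "C:/Program Files/Python310/Include"

    # Unquoted, the path falls apart at the space, which matches the
    # "unrecognized source file type 'Files/Python310/Include'" warning.
    print(f"cl /I {python_include}".split())

    # Quoted, the include directory is passed as a single argument.
    print("cl " + include_flags([python_include]))
```

Where the build command is handed to `subprocess` as a list of arguments rather than as one string, the quoting is handled by the library and a helper like this is unnecessary.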
Benjamin Glass	e8dd58b8cf	cpp_wrapper: Precompile device-specific header files (#146928 ) This saves us about a second per compilation, which is _massive_ for the OpInfo tests. Total OpInfo test runtime is down about 2x from this change alone. Relands #144002, with changes needed by fbcode internals. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146928 Approved by: https://github.com/desertfire	2025-03-17 20:40:15 +00:00
Shangdi Yu	df60500ab8	Fix too big to optimize in test, actually use O0 when aot_inductor.compile_wrapper_with_O0 is set (#148714 ) Summary: 1. Check against the "0" char instead 2. We got the following error when using anything other than O0 flag: `error: Function ZN5torch12aot_inductorL22__check_inputs_outputsEPP16AtenTensorOpaqueS3 is too big to optimize [-Werror,-Wignored-optimization-argument]` So we use O0 flag in wrapper code when `aot_inductor.compile_wrapper_opt_level` is set to `O0`. Test Plan: ``` buck run 'fbcode//mode/opt' fbcode//deeplearning/aot_inductor/cpu/test:ads_second_stage_dsnn_models_aoti_lowering_test -- -r AdsSecondStageDSNNModelsAOTILoweringTest ``` Differential Revision: D70670957 Pull Request resolved: https://github.com/pytorch/pytorch/pull/148714 Approved by: https://github.com/desertfire	2025-03-13 10:22:06 +00:00
Zhuoran Zhao	3745da18f4	[AOTI] Swith to local cpp compile for fbcode (#148592 ) Summary: as title, otherwise we can not find lamdhip64 Test Plan: https://www.internalfb.com/phabricator/paste/view/P1747104431 Differential Revision: D70637798 Pull Request resolved: https://github.com/pytorch/pytorch/pull/148592 Approved by: https://github.com/hl475	2025-03-08 08:38:26 +00:00
Benjamin Glass	d6d670ab4d	[AOTI] build CPU CPP kernels at O3, and all other code at O1 (#148587 ) In the future, we may also want to add LTO linking to further optimize the results (while still hopefully netting compile time benefits). Differential Revision: [D70641543](https://our.internmc.facebook.com/intern/diff/D70641543) Pull Request resolved: https://github.com/pytorch/pytorch/pull/148587 Approved by: https://github.com/desertfire	2025-03-05 22:47:46 +00:00
Bin Bao	df7e43e5d4	[AOTI] Fix aot_inductor_package test errors (#148279 ) Summary: Fix fbcode test failures introduced by https://github.com/pytorch/pytorch/pull/147975. Make sure script.ld is copied to the build-time directory. Differential Revision: D70454149 Pull Request resolved: https://github.com/pytorch/pytorch/pull/148279 Approved by: https://github.com/zoranzhao	2025-03-05 05:22:48 +00:00
Xuehai Pan	1cb4e2df65	[BE][PYFMT] migrate PYFMT for `torch._inductor` to `ruff format` (#144550 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144550 Approved by: https://github.com/jansel	2025-02-28 13:33:19 +00:00
Bin Bao	f104ef1248	[AOTI][refactor] Consolidate CppBuilder.build and CppBuilder.build_fbcode (#147975 ) Summary: Let CppBuilder handle all the cpp build logic Differential Revision: D70141808 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147975 Approved by: https://github.com/angelayi, https://github.com/yushangdi	2025-02-27 00:35:12 +00:00
PyTorch MergeBot	acca9b9cb0	Revert "[AOTI][refactor] Consolidate CppBuilder.build and CppBuilder.build_fbcode_cpu_re (#147803 )" This reverts commit `0b9da1ae0a`. Reverted https://github.com/pytorch/pytorch/pull/147803 on behalf of https://github.com/wdvr due to breaking internal tests, discussed with author ([comment](https://github.com/pytorch/pytorch/pull/147803#issuecomment-2683938121))	2025-02-26 05:32:17 +00:00
Bin Bao	0b9da1ae0a	[AOTI][refactor] Consolidate CppBuilder.build and CppBuilder.build_fbcode_cpu_re (#147803 ) Summary: Let CppBuilder handle all the cpp build logic Differential Revision: [D70146185](https://our.internmc.facebook.com/intern/diff/D70146185) Pull Request resolved: https://github.com/pytorch/pytorch/pull/147803 Approved by: https://github.com/malfet ghstack dependencies: #147805, #147806, #147807	2025-02-25 13:33:12 +00:00
Bin Bao	cc1c9826d4	[AOTI][refactor] Fix a typo (#147807 ) Summary: defination -> definition Differential Revision: [D70146182](https://our.internmc.facebook.com/intern/diff/D70146182) Pull Request resolved: https://github.com/pytorch/pytorch/pull/147807 Approved by: https://github.com/malfet ghstack dependencies: #147805, #147806	2025-02-25 13:33:12 +00:00
Bin Bao	2680e835c8	[AOTI][refactor] Rename use_absolute_path to use_relative_path (#147805 ) Summary: The option really means to compile a cpp file using its basename instead of its full path. Reland https://github.com/pytorch/pytorch/pull/147679. Differential Revision: [D70146184](https://our.internmc.facebook.com/intern/diff/D70146184) Pull Request resolved: https://github.com/pytorch/pytorch/pull/147805 Approved by: https://github.com/malfet	2025-02-25 13:32:54 +00:00
PyTorch MergeBot	890213f65f	Revert "[AOTI][refactor] Rename use_absolute_path to use_relative_path (#147679 )" This reverts commit `0b52d801d2`. Reverted https://github.com/pytorch/pytorch/pull/147679 on behalf of https://github.com/desertfire due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/147679#issuecomment-2680389225))	2025-02-25 04:11:13 +00:00
Benjamin Glass	33ff96b3f9	cpp_builder: unbreak clang++ detection (#147775 ) Fixes an issue where `_is_gcc` would match on `clang++` due to the string ending with `g++`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147775 Approved by: https://github.com/desertfire	2025-02-25 02:33:01 +00:00
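The detection bug described in the commit above is easy to reproduce in isolation. The sketch below is an assumption-laden illustration, not PyTorch's `_is_gcc`: both function names are hypothetical, and the stricter variant shows one possible way to avoid the false positive, not necessarily the fix that was merged.

```python
import os
import re


def naive_is_gcc(compiler: str) -> bool:
    """Hypothetical buggy check: any name containing gcc or g++ counts as GCC."""
    return bool(re.search(r"(gcc|g\+\+)", compiler))


def stricter_is_gcc(compiler: str) -> bool:
    """Hypothetical stricter check that rules out clang before matching GCC names."""
    name = os.path.basename(compiler)
    if "clang" in name:
        # "clang++" ends with "g++", so it has to be excluded explicitly.
        return False
    return bool(re.search(r"(gcc|g\+\+)", name))


if __name__ == "__main__":
    for compiler in ("g++", "/usr/bin/clang++", "x86_64-linux-gnu-g++-12"):
        print(compiler, naive_is_gcc(compiler), stricter_is_gcc(compiler))
```

Misclassifying `clang++` as GCC matters because compiler-specific flags are chosen from this check, so a false positive can hand Clang options it does not accept.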
Bin Bao	0b52d801d2	[AOTI][refactor] Rename use_absolute_path to use_relative_path (#147679 ) The option really means to compile a cpp file using its basename instead of its full path. Differential Revision: [D69722709](https://our.internmc.facebook.com/intern/diff/D69722709/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/147679 Approved by: https://github.com/angelayi	2025-02-24 21:44:33 +00:00
xinan.lin	8d618f3da7	[AOTI][XPU] Suppress multi-line comment warning for XPU. (#147710 ) This PR aims to suppress the multi-line comment warning in the sycl header when building Inductor cpp_wrapper. ``` /intel/oneapi/compiler/2025.0/include/sycl/detail/builtins/builtins.hpp:235:1: warning: multi-line comment [-Wcomment] ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/147710 Approved by: https://github.com/EikanWang, https://github.com/jansel	2025-02-24 07:28:59 +00:00
Bin Bao	d38db94689	[inductor][refactor] Move _compile_file to cpp_builder (#147202 ) Summary: To further consolidate cpp build logic into cpp_builder Test Plan: CI Differential Revision: D69595327 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147202 Approved by: https://github.com/yushangdi	2025-02-14 21:02:30 +00:00
PyTorch MergeBot	2fafcd37c3	Revert "cpp_wrapper: Precompile device-specific header files (#144002 )" This reverts commit `de6efa1feb`. Reverted https://github.com/pytorch/pytorch/pull/144002 on behalf of https://github.com/huydhn due to Sorry for reverting your change but this breaks some inductor tests running internally ([comment](https://github.com/pytorch/pytorch/pull/144002#issuecomment-2649569562))	2025-02-11 00:42:22 +00:00
Benjamin Glass	de6efa1feb	cpp_wrapper: Precompile device-specific header files (#144002 ) This saves us about a second per compilation, which is _massive_ for the OpInfo tests. Total OpInfo test runtime is down about 2x from this change alone. Differential Revision: [D69185685](https://our.internmc.facebook.com/intern/diff/D69185685) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144002 Approved by: https://github.com/desertfire	2025-02-10 17:13:09 +00:00
Henry Tsang	9d5bf38dec	[cpp_builder] refactor to reduce libcudart_static logs (#146394 ) Want to reduce logs from `log_msg = f'"libcudart_static.a" not found under {path}'`, which was added in https://github.com/pytorch/pytorch/pull/142175 Differential Revision: D69096354 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146394 Approved by: https://github.com/benjaminglass1, https://github.com/chenyang78	2025-02-05 00:41:30 +00:00
Bin Bao	16420a78eb	[AOTI] Remove AOTI_USE_CREATE_TENSOR_FROM_BLOB_V1 (#146039 ) Summary: The AOTI_USE_CREATE_TENSOR_FROM_BLOB_V1 macro was used to solve a FC issue and it can be removed now. Test Plan: CI Differential Revision: D68871245 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146039 Approved by: https://github.com/yushangdi, https://github.com/hl475	2025-01-30 19:01:19 +00:00
PyTorch MergeBot	cfbb27462e	Revert "[inductor][BE] Enable test_cpu_cpp_wrapper in fbcode (#145373 )" This reverts commit `b8087747f5`. Reverted https://github.com/pytorch/pytorch/pull/145373 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/145373#issuecomment-2619674197))	2025-01-28 17:46:11 +00:00
H. Vetinari	e6c1e6e20e	simplify torch.utils.cpp_extension.include_paths; use it in cpp_builder (#145480 ) While working on conda-forge integration, I needed to look at the way the include paths are calculated, and noticed an avoidable duplication between `torch/utils/cpp_extension.py` and `torch/_inductor/cpp_builder.py`. The latter already imports the former anyway, so simply reuse the same function. Furthermore, remove long-obsolete include-paths. AFAICT, the `/TH` headers have not existed since pytorch 1.11. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145480 Approved by: https://github.com/ezyang	2025-01-27 07:19:42 +00:00
Bin Bao	b8087747f5	[inductor][BE] Enable test_cpu_cpp_wrapper in fbcode (#145373 ) Differential Revision: D68278174 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145373 Approved by: https://github.com/Skylion007	2025-01-24 17:59:13 +00:00
Irem Yuksel	66bf7da446	Enable sleef for Win Arm64 (#144876 ) The Sleef module was disabled for Windows Arm64 in `b021486405`. This PR enables it again since the issue is no longer valid. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144876 Approved by: https://github.com/albanD, https://github.com/malfet Co-authored-by: Ozan Aydin <148207261+ozanMSFT@users.noreply.github.com>	2025-01-23 19:22:58 +00:00
Aaron Orenstein	893ca1dfe1	PEP585 update - torch/_inductor/[_-i]* (#145137 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145137 Approved by: https://github.com/bobrenjc93	2025-01-19 01:22:47 +00:00
bobrenjc93	a3ab27b8e0	Migrate from Tuple -> tuple in torch/_inductor (#144264 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144264 Approved by: https://github.com/eellison	2025-01-07 03:27:27 +00:00
Bin Bao	fecf03fa3f	[AOTI][reland] Emit a CMakeLists.txt when package_cpp_only (#143680 ) Summary: Emit a CMakeLists.txt with compile and link options when package_cpp_only is specified. After unzipping AOTI generated .pt2 package file, user can manually build the generated model code in their local environment. Pull Request resolved: https://github.com/pytorch/pytorch/pull/143680 Approved by: https://github.com/huydhn	2024-12-21 03:48:40 +00:00
xinan.lin	b5e159270a	[AOTI XPU] Replace intel compiler with g++ to build inductor CPP wrapper in runtime. (#142322 ) This PR aims to remove the dependency on the Intel compiler at Inductor runtime. Now we only need SYCL_HOME at runtime to find the sycl headers and libs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/142322 Approved by: https://github.com/EikanWang, https://github.com/desertfire, https://github.com/albanD ghstack dependencies: #143491	2024-12-21 02:27:04 +00:00
Tom Ritchford	b5475d334e	[inductor] Fix an unused variable in cpu_vec_isa.py (#138473 ) ---- * Extracted from https://github.com/pytorch/pytorch/pull/133492 Pull Request resolved: https://github.com/pytorch/pytorch/pull/138473 Approved by: https://github.com/EikanWang, https://github.com/albanD, https://github.com/xuhancn	2024-12-20 18:50:19 +00:00
Huamin Li	f5af87c23c	Make Inductor cpp backend enable_floating_point_contract_flag to take string (#143450 ) Differential Revision: D66269001 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143450 Approved by: https://github.com/desertfire	2024-12-20 16:28:54 +00:00
PyTorch MergeBot	71479a9b9c	Revert "[AOTI] Emit a CMakeLists.txt when package_cpp_only (#143352 )" This reverts commit `429f4cd140`. Reverted https://github.com/pytorch/pytorch/pull/143352 on behalf of https://github.com/huydhn due to Sorry for reverting your change but the new test is failing on ROCm ([comment](https://github.com/pytorch/pytorch/pull/143352#issuecomment-2556365140))	2024-12-20 06:21:31 +00:00
Bin Bao	429f4cd140	[AOTI] Emit a CMakeLists.txt when package_cpp_only (#143352 ) Summary: Emit a CMakeLists.txt with compile and link options when package_cpp_only is specified. After unzipping AOTI generated .pt2 package file, user can manually build the generated model code in their local environment. Differential Revision: [D67458526](https://our.internmc.facebook.com/intern/diff/D67458526) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143352 Approved by: https://github.com/malfet	2024-12-19 22:01:05 +00:00
Bin Bao	0e8013fc1c	[AOTI] Fix a typo in cpp_builder.py (#143351 ) Summary: passthough -> passthrough Pull Request resolved: https://github.com/pytorch/pytorch/pull/143351 Approved by: https://github.com/yushangdi, https://github.com/chenyang78 ghstack dependencies: #143350	2024-12-18 16:28:37 +00:00
Benjamin Glass	bb06fc79fb	cpp_builder: handle CUDA lib paths involving "stubs" in more circumstances (#142175 ) conda packages for `cuda-driver-dev=12.4.127` use a "stubs" subdirectory to contain `libcuda.so`. This was previously only handled by cpp_builder in some cases, but now needs to be potentially handled more generally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/142175 Approved by: https://github.com/desertfire	2024-12-17 17:21:27 +00:00
Tom Ritchford	dc23f1944a	Remove unused Python variables in torch/[_-a]* (#133492 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492 Approved by: https://github.com/albanD	2024-12-12 17:39:14 +00:00
Colin L. Rice	d68403df3b	filelock: Make waitcounter variant to use (#139816 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/139816 Approved by: https://github.com/ezyang	2024-12-12 01:18:34 +00:00
PyTorch MergeBot	5c97ac9721	Revert "Remove unused Python variables in torch/[_-a]* (#133492 )" This reverts commit `fda975a7b3`. Reverted https://github.com/pytorch/pytorch/pull/133492 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else. The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/133492#issuecomment-2536635516))	2024-12-11 17:29:12 +00:00
PyTorch MergeBot	2374d460d0	Revert "filelock: Make waitcounter variant to use (#139816 )" This reverts commit `237c4b559c`. Reverted https://github.com/pytorch/pytorch/pull/139816 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else. The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/139816#issuecomment-2536616808))	2024-12-11 17:26:46 +00:00
Colin L. Rice	237c4b559c	filelock: Make waitcounter variant to use (#139816 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/139816 Approved by: https://github.com/ezyang	2024-12-10 23:02:59 +00:00
Tom Ritchford	fda975a7b3	Remove unused Python variables in torch/[_-a]* (#133492 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492 Approved by: https://github.com/albanD	2024-12-10 21:48:44 +00:00
Colin Peppler	0602676c8d	[CUTLASS][AOTI] Fixes undefined symbol: cudaLaunchKernelExC (#142094 ) Summary: ### Context * When compiling the object file for a CUTLASS kernel, CUDA RT symbols are left undefined. * When compiling the final shared object file, we statically link with `libcudart_static.a`. * One important thing is that ordering matters when specifying the lib search paths (-L). Test Plan: ``` // before diff RuntimeError: Failure loading .so: /tmp/tmpqhz_dnza/model.so: undefined symbol: cudaLaunchKernelExC ``` Differential Revision: D66793974 Pull Request resolved: https://github.com/pytorch/pytorch/pull/142094 Approved by: https://github.com/chenyang78, https://github.com/hl475	2024-12-06 02:18:54 +00:00
xinan.lin	4742080ed9	[AOTI XPU] Enable Cpp wraper for Intel GPU. (#135318 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/135318 Approved by: https://github.com/jgong5, https://github.com/EikanWang, https://github.com/guangyey, https://github.com/desertfire	2024-11-26 11:51:32 +00:00

111 Commits