When clang-cl parses its command-line arguments, it expects MSVC-style arguments (beginning with `/`, such as `/WX`, `/MD`, etc.), while clang-style arguments must be preceded by `-Xclang`; otherwise, the clang-style parameters are ignored as unrecognized compiler options.
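As an illustration of the distinction, a minimal CMake sketch (the specific options and the guard are illustrative, not taken from this PR):

```cmake
# Assumes clang-cl is the C++ compiler (e.g. -DCMAKE_CXX_COMPILER=clang-cl).
if(MSVC AND CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
  # MSVC-style options are understood directly by clang-cl.
  add_compile_options(/WX /MD)
  # Clang-internal (cc1) options must be forwarded with -Xclang; passed bare,
  # they would be ignored as unrecognized compiler options.
  add_compile_options(-Xclang -fno-pch-timestamp)
endif()
```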
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148097
Approved by: https://github.com/jeffdaily
ACL is already built with PyTorch as a shared library when USE_MKLDNN_ACL is set.
Currently, it is only used indirectly in ATen via oneDNN for AArch64 targets. However, there are cases where it makes sense to use ACL directly without oneDNN as an intermediary, e.g. quantization. See #145942, #147337, #146620.
This patch enables such use cases by exposing ACL to ATen.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148584
Approved by: https://github.com/malfet
# Motivation
Currently, Intel GPU is moving forward rapidly with feature development. We (Intel GPU) want independent version control over the oneDNN component so that we can quickly adopt the optimizations and bug fixes provided by the oneDNN team.
This PR does not change the behavior of other backends such as Intel CPU and ARM; they can keep using the stable version contained in `third_party/ideep`.
# Detail
At compilation time, we `git clone` oneDNN from the URL `https://github.com/oneapi-src/oneDNN` and check out the tag/commit that the Intel GPU backend prefers. This feature is supported by CMake's `ExternalProject_Add` command.
The following is a build log example:
```bash
[11/60] Performing download step (git clone) for 'xpu_mkldnn_proj'
Cloning into 'xpu_mkldnn_proj'...
HEAD is now at 5e92240360 meta: updated citation file
[12/60] Performing update step for 'xpu_mkldnn_proj'
-- Already at requested tag: v3.7
[13/60] No patch step for 'xpu_mkldnn_proj'
```
The log demonstrates that we explicitly download the source files and check out a specific tag. The oneDNN source is located at `build/xpu_mkldnn_proj-prefix/src/xpu_mkldnn_proj`.
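For reference, a minimal `ExternalProject_Add` sketch of this mechanism (the arguments are abbreviated and illustrative, not the exact code in this PR):

```cmake
include(ExternalProject)
# Pin oneDNN for the XPU backend to its own tag, independently of the copy
# consumed via third_party/ideep.
ExternalProject_Add(xpu_mkldnn_proj
  GIT_REPOSITORY https://github.com/oneapi-src/oneDNN
  GIT_TAG        v3.7   # the tag/commit preferred by the Intel GPU backend
  # Configure/build/install steps are elided in this sketch.
  CONFIGURE_COMMAND ""
  BUILD_COMMAND     ""
  INSTALL_COMMAND   ""
)
```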
# Runtime verification
Running UT for CPU:
```bash
onednn_verbose,v1,info,oneDNN v3.7.0 (commit fc3f17ad469b8a6da7192ae12d32625faa509f1e)
onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:24
onednn_verbose,v1,info,cpu,isa:Intel AVX-512 with Intel DL Boost
onednn_verbose,v1,info,gpu,runtime:none
onednn_verbose,v1,info,graph,backend,0:dnnl_backend
onednn_verbose,v1,primitive,info,template:operation,engine
```
Running UT for Intel GPU:
```bash
onednn_verbose,v1,info,oneDNN v3.7.0 (commit 5e9224036021433d2577548ed0539fe9a53256bc)
onednn_verbose,v1,info,cpu,runtime:threadpool,nthr:24
onednn_verbose,v1,info,cpu,isa:Intel AVX-512 with Intel DL Boost
onednn_verbose,v1,info,gpu,runtime:DPC++
onednn_verbose,v1,info,gpu,engine,sycl gpu device count:2
```
We can see that Intel GPU uses commit `5e922` (tag v3.7), while CPU uses `fc3f17`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147926
Approved by: https://github.com/EikanWang
Co-authored-by: leizhenyuan <zhenyuan.lei@intel.com>
According to the [APL documentation](https://developer.arm.com/documentation/101004/2404/General-information/Arm-Performance-Libraries-example-programs), libraries ending with `_mp` are OpenMP multi-threaded libraries.
When a project is compiled with MSVC and the `-openmp` flag, the vcomp library (the Visual C++ implementation of OpenMP) is used for runtime calls.
However, the current APL implementation uses the `libomp.dll` (LLVM) variant.
As a result, there are unexpected behaviors at runtime.
---
For example:
```python
import torch
# Create a sparse tensor
# Input (Sparse Tensor):
# [[0, 1],
# [1, 0]]
indices = torch.tensor([[0, 1], [1, 0]])
values = torch.tensor([1, 1], dtype=torch.float32)
size = torch.Size([2, 2])
sparse_tensor = torch.sparse_coo_tensor(indices, values, size)
# Convert sparse tensor to dense tensor
dense_tensor = sparse_tensor.to_dense()
# Expected Output (Dense Tensor):
# [[0, 1],
# [1, 0]]
print("\nDense Tensor:")
print(dense_tensor)
```
However, it prints unexpected outputs such as:
```python
# [[0, 11],
# [10, 0]]
```
The issue arises because the following code does not function as expected at runtime:
https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/ParallelOpenMP.h#L30
```c++
// returns 1; however, since OpenMP is enabled it should return the total number of threads
int64_t num_threads = omp_get_num_threads();
```
---
At runtime, loading multiple OpenMP libraries (in this case `libomp` and `vcomp`) causes unexpected behaviors.
So we've switched the libraries from the `_mp` to the non-`_mp` versions, and we use `vcomp` for OpenMP calls.
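For context, the usual way to let CMake link the toolchain's own OpenMP runtime (vcomp under MSVC) is the imported `OpenMP::OpenMP_CXX` target; a generic sketch, not the exact change in this PR:

```cmake
# Resolve the compiler's native OpenMP runtime (vcomp for MSVC) so that only
# a single OpenMP library ends up loaded at runtime.
find_package(OpenMP REQUIRED)
target_link_libraries(my_target PRIVATE OpenMP::OpenMP_CXX)  # illustrative target
```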
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145215
Approved by: https://github.com/ozanMSFT, https://github.com/malfet
Co-authored-by: Ozan Aydin <148207261+ozanMSFT@users.noreply.github.com>
# Motivation
This PR aims to maintain backward compatibility when building PyTorch XPU with the old and new compilers.
# Additional Context
The details are as follows. The new compiler (2025.0.0) has some breaking changes compared with the old compiler (2024.1), for example:
1. On Windows, the SYCL library is named `sycl7.lib` with the old compiler but `sycl.lib` with the new compiler.
2. On Linux, to support ABI=0 we have to link `libsycl-preview.so` with the old compiler, while with the new compiler we can link `libsycl.so` and get the same ABI compatibility.
3. We added a macro `SYCL_COMPILER_VERSION` so that our new code keeps good backward compatibility with the old compiler. The new features (Event elapsed_time, memory summary, and device architecture property) introduced by the new compiler are now guarded by the `SYCL_COMPILER_VERSION` macro, as sketched below.
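A minimal sketch of how such a guard can be wired up, assuming the compiler version has already been detected into a CMake variable (the variable name and version threshold are illustrative assumptions):

```cmake
# Expose the detected SYCL compiler version to the source code.
add_compile_definitions(SYCL_COMPILER_VERSION=${SYCL_COMPILER_VERSION_VALUE})
# C++ code can then guard features that need the new (2025.0.0) compiler:
#   #if SYCL_COMPILER_VERSION >= 20250000   // illustrative threshold
#   // Event elapsed_time, memory summary, device architecture property
#   #endif
```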
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139258
Approved by: https://github.com/EikanWang, https://github.com/atalman, https://github.com/gujinghui
This PR enables PyTorch for Windows on Arm64 - CPU only.
Currently, there aren't any checks in place to build and test for Windows on Arm64, but we're working to implement those as soon as possible.
We recommend using [Arm Performance Libraries (APL)](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Libraries) as a BLAS option, which is introduced in this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133088
Approved by: https://github.com/malfet
Co-authored-by: cristian panaite <panaite.cristian2000@gmail.com>
Co-authored-by: Stefan-Alin Pahontu <56953855+alinpahontu2912@users.noreply.github.com>
Co-authored-by: Ozan Aydin <148207261+ozanMSFT@users.noreply.github.com>
**Motivation:**
In PyTorch, ATen vectorization supports multiple platforms, including x86 and Arm, as well as multiple data types. It provides a generic implementation of a Vector (Vec) type that allows the programmer to write code packing various primitives (such as floats) within 256-bit and 512-bit registers. It can be extended to support other ISAs easily by adding more VecISA subclasses.
**Reference Link:** https://github.com/pytorch/pytorch/tree/main/aten/src/ATen/cpu/vec
**This PR:**
* Our goal with this contribution is to add an SVE backend for Vec in the ATen CPU vectorization, which can benefit any Arm-architecture CPU that supports SVE.
* More about SVE ISA for ARM: [https://developer.arm.com/Architectures/Scalable Vector Extensions](https://developer.arm.com/Architectures/Scalable%20Vector%20Extensions)
* We are using the ARM C Language Extensions for SVE (https://developer.arm.com/documentation/102699/0100/Optimizing-with-intrinsics ) to accelerate performance for various operators in the SVE backend for Vec.
* Currently we are adding support only for the SVE ISA with a vector length of 256 bits (SVE256). In the future, we plan to extend this SVE support to other vector lengths as well; a build-time detection sketch follows this list.
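As a build-system illustration, SVE256 availability could be probed roughly as follows (the flag spellings are the standard GCC/Clang ones; the result variable name is an illustrative assumption, not the code in this PR):

```cmake
include(CheckCXXCompilerFlag)
# Probe whether the compiler can target SVE with a fixed 256-bit vector length.
check_cxx_compiler_flag("-march=armv8-a+sve -msve-vector-bits=256" CXX_HAS_SVE256)
if(CXX_HAS_SVE256)
  add_compile_options(-march=armv8-a+sve -msve-vector-bits=256)
endif()
```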
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119571
Approved by: https://github.com/malfet, https://github.com/snadampal
Co-authored-by: Divya Kotadiya <divya.kotadiya@fujitsu.com>
# Motivation
If we build XPU via oneAPI 2024.2, the build fails because `sycl-preview.lib` exists on Windows, and linking the unexpected library results in `error LNK2019: unresolved external symbol`.
# Solution
Explicitly use `sycl-preview` in the Linux build only.
# Additional Context
For `find_library`, please note that the result variable will not be updated once it has been stored:
```
If the library is found the result is stored in the variable and the search will not be repeated unless the variable is cleared.
```
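A minimal sketch of the resulting logic (the variable and hint names are illustrative assumptions):

```cmake
# Clear any cached result first; find_library will not search again once the
# variable has been stored in the cache.
unset(SYCL_LIBRARY CACHE)
if(CMAKE_SYSTEM_NAME STREQUAL "Linux")
  # Only the Linux build should pick up sycl-preview.
  find_library(SYCL_LIBRARY NAMES sycl-preview HINTS ${SYCL_LIBRARY_DIR})
else()
  # On Windows, skip sycl-preview.lib and link the regular SYCL library.
  find_library(SYCL_LIBRARY NAMES sycl HINTS ${SYCL_LIBRARY_DIR})
endif()
```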
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133845
Approved by: https://github.com/min-jean-cho, https://github.com/EikanWang, https://github.com/atalman, https://github.com/malfet
This PR switches to the cuDSS library and has the same purpose as #127692, which is to add Sparse CSR tensor support to linalg.solve.
Fixes #69538
Minimal example of usage:
```python
import torch

if __name__ == '__main__':
    spd = torch.rand(4, 3)
    A = spd.T @ spd
    b = torch.rand(3).to(torch.float64).cuda()
    A = A.to_sparse_csr().to(torch.float64).cuda()
    x = torch.linalg.solve(A, b)
    print((A @ x - b).norm())
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129856
Approved by: https://github.com/amjames, https://github.com/lezcano, https://github.com/huydhn
Co-authored-by: Zihang Fang <zhfang1108@gmail.com>
Co-authored-by: Huy Do <huydhn@gmail.com>
# Motivation
This PR intends to support an ABI=0 build for the XPU backend.
# Additional Context
The major change is adding the compilation option `-D__INTEL_PREVIEW_BREAKING_CHANGES` for the host compiler (gcc) and `-fpreview-breaking-changes` for the XPU device kernel code compiler (icpx), as sketched after this list. Why? Because we:
- use gcc to compile host code and link the SYCL runtime, so we need to pass `-D__INTEL_PREVIEW_BREAKING_CHANGES` to tell the host compiler to invoke the ABI-neutral API included in SYCL;
- use icpx to compile device kernel code and link the SYCL runtime, so we need to pass `-fpreview-breaking-changes` to tell the device kernel compiler to build ABI-neutral code; and, besides,
- `libsycl-preview.so` is an ABI-neutral library, but `libsycl.so` is not.
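A minimal sketch of that flag plumbing (the flag and library names come from this PR; the variable and target names are illustrative assumptions):

```cmake
# Host code (gcc): select the ABI-neutral SYCL API.
add_compile_definitions(__INTEL_PREVIEW_BREAKING_CHANGES)
# Device kernel code (icpx): emit ABI-neutral code.
string(APPEND SYCL_KERNEL_FLAGS " -fpreview-breaking-changes")  # illustrative variable
# Link the ABI-neutral runtime library.
# target_link_libraries(torch_xpu PRIVATE sycl-preview)  # illustrative target
```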
This PR depends on https://github.com/pytorch/pytorch/pull/131643.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130110
Approved by: https://github.com/EikanWang, https://github.com/gujinghui, https://github.com/albanD
`exec_program` has been deprecated since CMake 3.0 in favor of `execute_process`; see https://cmake.org/cmake/help/v3.18/command/exec_program.html
This makes the following warning disappear:
```
CMake Warning (dev) at cmake/Modules/FindARM.cmake:5 (EXEC_PROGRAM):
Policy CMP0153 is not set: The exec_program command should not be called.
Run "cmake --help-policy CMP0153" for policy details. Use the cmake_policy
command to set the policy and suppress this warning.
Use execute_process() instead.
```
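The mechanical translation looks like this (the command and variable names are illustrative, not the actual FindARM.cmake contents):

```cmake
# Deprecated form:
#   exec_program(uname ARGS -m OUTPUT_VARIABLE HOST_ARCH)
# Replacement:
execute_process(
  COMMAND uname -m
  OUTPUT_VARIABLE HOST_ARCH
  OUTPUT_STRIP_TRAILING_WHITESPACE
)
```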
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129714
Approved by: https://github.com/kit1980
The current logic to set the HAS_SBGEMM flag is skipped when the BLAS libraries have already been found, i.e., when set from the environment variable BLAS=OpenBLAS. If BLAS_LIBRARIES is already set, the code that checks whether the BLAS library provides sbgemm is never executed. This commit moves that logic out so it runs unconditionally, as sketched below.
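A rough sketch of such an unconditional probe (the symbol and variable names are approximations, not the exact code in this commit):

```cmake
include(CheckFunctionExists)
# Probe the already-selected BLAS for the bfloat16 sbgemm kernel, regardless
# of how BLAS_LIBRARIES was populated (found by CMake or set via BLAS=OpenBLAS).
set(CMAKE_REQUIRED_LIBRARIES ${BLAS_LIBRARIES})
check_function_exists("sbgemm_" HAS_SBGEMM)
unset(CMAKE_REQUIRED_LIBRARIES)
```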
Fixes #ISSUE_NUMBER
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125227
Approved by: https://github.com/malfet
As FindPythonInterp and FindPythonLibs have been deprecated since CMake 3.12,
replace `PYTHON_EXECUTABLE` with `Python_EXECUTABLE` everywhere (CMake variable names are case-sensitive).
This makes PyTorch buildable with the python3 binary shipped with Xcode on macOS.
TODO: Get rid of `FindNumpy` as it's part of the Python package.
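The modern replacement is the unified `FindPython` module:

```cmake
# Available since CMake 3.12; defines Python_EXECUTABLE among other variables.
find_package(Python COMPONENTS Interpreter Development)
message(STATUS "Using Python: ${Python_EXECUTABLE}")
```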
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124613
Approved by: https://github.com/cyyever, https://github.com/Skylion007
# Motivation
This PR is part of RFC #114848 and is a successor to #116249 and #116019. It depends on the oneDNN compilation in #116249; some runtime support is needed from #116019.
ATen operators like `addmm` and `baddmm` are defined in `Blas.cpp` in `aten/src/ATen/native/mkldnn/xpu/`.
Accompanying these files, which provide the core functionality, `BlasImpl.h`, `Utils.h`, and other files provide basic utilities for them. For instance, `Utils.h` provides common memory-descriptor query utilities for `Matmul.h`, and these utility functions will also be used by other primitives, like `convolution`. `BlasImpl.h` is a header that provides helpers for shape-info processing in matmul-related operators. It helps not only basic GEMM operators like `addmm` and `baddmm` but also fusion operators used in `torch.compile`, like `linear_pointwise` in #117824.
In the next stage, we will continue to complete the oneDNN support by enabling `matmul fusion` and `convolution` related code.
Co-authored-by: xiaolil1 <xiaoli.liu@intel.com>
Co-authored-by: lei,zhenyuan <zhenyuan.lei@intel.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117202
Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/malfet
ghstack dependencies: #117098, #117112