Commit Graph

141 Commits

Author SHA1 Message Date
Xiao Wang
976ff5cf01 Add cmake hints to USE_SYSTEM_NVTX for nvtx3 include dir (#147418)
per title

sometimes, it's hard for cmake to find NVTX3 without the cuda include path hint
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147418
Approved by: https://github.com/nWEIdia, https://github.com/malfet
2025-02-26 20:52:28 +00:00
cyy
6b60f4bc91 Fix some typos in cuda.cmake (#141462)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141462
Approved by: https://github.com/peterbell10
2024-11-26 01:08:25 +00:00
Jeongseok Lee
3cfd244495 Add USE_SYSTEM_NVTX option (#138287)
## Summary

We are currently [updating](https://github.com/conda-forge/pytorch-cpu-feedstock/pull/277) the [`conda-forge::pytorch`](https://anaconda.org/conda-forge/pytorch) package to version 2.5.0. This update includes a new dependency, the third_party/NVTX submodule. However, like other package management frameworks (e.g., apt), conda-forge prefers using system-installed packages instead of vendor-provided third-party packages.

This pull request aims to add an option, `USE_SYSTEM_NVTX`, to select whether to use the vendored nvtx or the system-installed one, with the default being the vendored one (which is the current behavior).

## Test Plan

The `USE_SYSTEM_NVTX` option is tested by building the `conda-forge::pytorch` package with the change applied as a [patch](cd1d2464dd/recipe/patches/0005-Use-system-nvtx3.patch).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138287
Approved by: https://github.com/albanD
2024-10-19 04:26:01 +00:00
Zitong Zhan
90c821814e SparseCsrCUDA: cuDSS backend for linalg.solve (#129856)
This PR switches to cuDSS library and has the same purpose of #127692, which is to add Sparse CSR tensor support to linalg.solve.
Fixes #69538

Minimum example of usage:
```
import torch

if __name__ == '__main__':
    spd = torch.rand(4, 3)
    A = spd.T @ spd
    b = torch.rand(3).to(torch.float64).cuda()
    A = A.to_sparse_csr().to(torch.float64).cuda()

    x = torch.linalg.solve(A, b)
    print((A @ x - b).norm())

```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129856
Approved by: https://github.com/amjames, https://github.com/lezcano, https://github.com/huydhn

Co-authored-by: Zihang Fang <zhfang1108@gmail.com>
Co-authored-by: Huy Do <huydhn@gmail.com>
2024-08-22 07:57:30 +00:00
cyy
c3d02fa390 [Reland2] Update NVTX to NVTX3 (#109843)
Another attempt to update NVTX to NVTX3. We now avoid changing NVTX header inclusion of existing code.  The advantage of NVTX3 over NVTX is that it is a header-only library so that linking with NVTX3 can greatly simplify our CMake and other building scripts for finding libraries in user environments. In addition, NVTX are indeed still present in the latest CUDA versions, but they're no longer a compiled library: It's now a header-only library. That's why there isn't a .lib file anymore.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109843
Approved by: https://github.com/peterbell10, https://github.com/eqy

Co-authored-by: Ivan Zaitsev <108101595+izaitsevfb@users.noreply.github.com>
2024-08-20 16:33:26 +00:00
Mikayla Gawarecki
018e48c337 [Reland] Add wrappers for synchronous GPUDirect Storage APIs (#133489)
Reland #130633

USE_CUFILE turned off by default in this version
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133489
Approved by: https://github.com/albanD
2024-08-15 17:11:52 +00:00
PyTorch MergeBot
e191b83462 Revert "Add wrappers for synchronous GPUDirect Storage APIs (#130633)"
This reverts commit 709ddf7a9d.

Reverted https://github.com/pytorch/pytorch/pull/130633 on behalf of https://github.com/clee2000 due to still failing internally D60265673 ([comment](https://github.com/pytorch/pytorch/pull/130633#issuecomment-2253239607))
2024-07-26 18:08:20 +00:00
Mikayla Gawarecki
709ddf7a9d Add wrappers for synchronous GPUDirect Storage APIs (#130633)
Based in part on https://github.com/NVIDIA/apex/pull/1774

Differential Revision: [D60155434](https://our.internmc.facebook.com/intern/diff/D60155434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130633
Approved by: https://github.com/albanD
2024-07-25 22:23:38 +00:00
PyTorch MergeBot
e4b5645f83 Revert "Add wrappers for synchronous GPUDirect Storage APIs (#130633)"
This reverts commit 5b5e0698a5.

Reverted https://github.com/pytorch/pytorch/pull/130633 on behalf of https://github.com/clee2000 due to breaking a lot of jobs and build rules internally D60085885, possibly needs to update some bazel build? ([comment](https://github.com/pytorch/pytorch/pull/130633#issuecomment-2245806738))
2024-07-23 17:19:34 +00:00
Mikayla Gawarecki
5b5e0698a5 Add wrappers for synchronous GPUDirect Storage APIs (#130633)
Based in part on https://github.com/NVIDIA/apex/pull/1774

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130633
Approved by: https://github.com/albanD
2024-07-22 14:51:24 +00:00
Nikita Shulga
c547b2e871 Fix python detection in cuda.cmake (#130651)
If Python package has not been detected previously, call it here

This fixes regression introduced by https://github.com/pytorch/pytorch/pull/128801 that results in annoying, but harmless warning reported in https://github.com/pytorch/pytorch/issues/129777

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130651
Approved by: https://github.com/Skylion007
2024-07-15 03:45:31 +00:00
cyy
479ce5e2f4 Remove outdated CUDA code from CMake (#128801)
It's possible to simplify some CUDA handling logic in CMake.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128801
Approved by: https://github.com/r-barnes, https://github.com/malfet
2024-06-21 15:00:00 +00:00
Nikita Shulga
0910429d72 [BE][CMake] Use FindPython module (#124613)
As FindPythonInterp and FindPythonLibs has been deprecated since cmake-3.12

Replace `PYTHON_EXECUTABLE` with `Python_EXECUTABLE` everywhere (CMake variable names are case-sensitive)

This makes PyTorch buildable with python3 binary shipped with XCode on MacOS

TODO: Get rid of `FindNumpy` as its part of Python package
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124613
Approved by: https://github.com/cyyever, https://github.com/Skylion007
2024-05-29 13:17:35 +00:00
cyy
e4b245292f Remove caffe2::tensorrt target code from cuda.cmake (#127204)
Following #126542.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127204
Approved by: https://github.com/ezyang
2024-05-28 04:42:14 +00:00
Eddie Yan
967dd31621 [cuDNN] Cleanup cuDNN < 8.1 ifdefs (#120862)
Follow-up of #95722

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120862
Approved by: https://github.com/Skylion007
2024-03-07 01:46:25 +00:00
PyTorch MergeBot
ee96399bb4 Revert "[Reland2] Update NVTX to NVTX3 (#109843)"
This reverts commit dcb486232d.

Reverted https://github.com/pytorch/pytorch/pull/109843 on behalf of https://github.com/atalman due to Diff broke internal builds and tests ([comment](https://github.com/pytorch/pytorch/pull/109843#issuecomment-1841105398))
2023-12-05 16:10:20 +00:00
cyyever
dcb486232d [Reland2] Update NVTX to NVTX3 (#109843)
Another attempt to update NVTX to NVTX3. We now avoid changing NVTX header inclusion of existing code.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109843
Approved by: https://github.com/peterbell10
2023-12-04 19:02:07 +00:00
Peter Bell
93cea394de CMake: Loosen CUDA consistency check (#113174)
Closes #108931, closes #108932, see also conda-forge/pytorch-cpu-feedstock#203

Currently we compare `CUDA_INCLUDE_DIRS` and expect exact equality
with `CUDAToolkit_INCLUDE_DIR` however this fails in the presense of
symbolic links or for split installs where there are multiple include paths.
Given that, it makes sense to loosen the requirement to just version
equality under the assumption that two installs of the same version
should still be compatible.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113174
Approved by: https://github.com/malfet
2023-11-08 02:51:18 +00:00
cyy
a6b452dfdc [2/N] Enable Wunused-result, Wunused-variable and Wmissing-braces in torch targets (#110836)
This PR enables Wunused-result, Wunused-variable and Wmissing-braces because our code base is clean.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110836
Approved by: https://github.com/Skylion007
2023-10-11 23:49:15 +00:00
PyTorch MergeBot
22cade56ba Revert "[Reland] Upgrade NVTX to NVTX3 (#97582)"
This reverts commit 5bbfb96203.

Reverted https://github.com/pytorch/pytorch/pull/97582 on behalf of https://github.com/izaitsevfb due to Breaks meta RL builds ([comment](https://github.com/pytorch/pytorch/pull/97582#issuecomment-1679568525))
2023-08-15 20:55:12 +00:00
cyy
5bbfb96203 [Reland] Upgrade NVTX to NVTX3 (#97582)
PR #90689 replaces NVTX with NVTX3. However, the torch::nvtoolsext is created only when the third party NVTX is used.
 This is clear a logical error. We now move the creation code out of the branch to cover all cases. This should fix the issues reported in the comments of  #90689.

It would be better to move configurations of the failed FRL jobs to CI tests so that we can find such issues early before merging.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97582
Approved by: https://github.com/peterbell10
2023-08-14 16:55:25 +00:00
Jesse Cai
f81f9093ec [core][pruning][feature] cuSPARSELt build integration (#103700)
Summary:

This stack of PR's integrates cuSPARSELt into PyTorch.

This PR adds support for cuSPARSELt into the build process.
It adds in a new flag, USE_CUSPARSELT that defaults to false.

When USE_CUSPASRELT=1 is specified, the user can also specify
CUSPASRELT_ROOT, which defines the path to the library.

Compiling pytorch with cusparselt support can be done as follows:

``
USE_CUSPARSELT=1
CUSPARSELT_ROOT=/path/to/cusparselt

python setup.py develop
```

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103700
Approved by: https://github.com/albanD
2023-08-02 12:48:39 +00:00
Te
a73ad82c8f conditional CMAKE_CUDA_STANDARD (#104240)
Fixes #104237

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104240
Approved by: https://github.com/malfet
2023-06-27 18:41:25 +00:00
cyy
c8877e6080 enable some cuda warnings (#95568)
Currently some CUDA warnings are disabled due to  some old issues of code quality that are fixed now. So it is time to remove the suppression.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95568
Approved by: https://github.com/albanD
2023-04-28 02:39:17 +00:00
PyTorch MergeBot
5170995b2a Revert "Upgrade NVTX to NVTX3 (#90689)"
This reverts commit e64ddd1ab9.

Reverted https://github.com/pytorch/pytorch/pull/90689 on behalf of https://github.com/osalpekar due to Build Failures due to not being able to find one nvtx3 header in FRL jobs: [D42332540](https://www.internalfb.com/diff/D42332540)
2023-03-24 18:16:06 +00:00
cyy
e64ddd1ab9 Upgrade NVTX to NVTX3 (#90689)
Due to recent upgrade to CUDA 11, we can upgrade NVTX to NVTX3 as well, which is a header only library that can simplify the building system a lot.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90689
Approved by: https://github.com/soumith, https://github.com/malfet
2023-03-23 01:56:42 +00:00
Peter Bell
c5f6092591 Use FindCUDAToolkit to find cuda dependencies (#82695)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82695
Approved by: https://github.com/malfet
2023-03-01 17:26:36 +00:00
PyTorch MergeBot
801b3f8fc7 Revert "Use FindCUDAToolkit to find cuda dependencies (#82695)"
This reverts commit 7289d22d67.

Reverted https://github.com/pytorch/pytorch/pull/82695 on behalf of https://github.com/peterbell10 due to Breaks torchaudio build
2023-02-28 02:29:09 +00:00
Peter Bell
7289d22d67 Use FindCUDAToolkit to find cuda dependencies (#82695)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82695
Approved by: https://github.com/malfet
2023-02-21 22:35:17 +00:00
cyy
5fa7120722 Simplify CMake CUDNN code (#91676)
1. Move CUDNN code to seperate module.
2. Merge CUDNN public and private targets into a single private target. There is no need to expose CUDNN dependency.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91676
Approved by: https://github.com/malfet
2023-02-08 01:06:10 +00:00
cyy
9291f9b9e2 Simplify cmake code (#91546)
We use various newer CMake features to simplify build system:
1.Caffe2::threads is replaced by threads::threads.
2.Some unused MSVC flags are removed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91546
Approved by: https://github.com/malfet, https://github.com/Skylion007
2023-02-08 01:05:19 +00:00
cyy
afd7b581aa Simplify OpenMP detection in CMake (#91576)
We greatly simplify the handing of OpenMP in CMake by using caffe2::openmp target thoroughly. We follow the old behavior by defaulting to MKL OMP library and detecting OMP flags otherwise.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91576
Approved by: https://github.com/malfet
2023-02-04 11:50:06 +00:00
cyy
9710ac6531 Some CMake and CUDA cleanup given recent update to C++17 (#90599)
The main changes are:
1. Remove outdated checks for old compiler versions because they can't support C++17.
2. Remove outdated CMake checks because it now requires 3.18.
3. Remove outdated CUDA checks because we are moving to CUDA 11.

Almost all changes are in CMake files for easy audition.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90599
Approved by: https://github.com/soumith
2022-12-30 11:19:26 +00:00
PyTorch MergeBot
deb414a43f Revert "Use FindCUDAToolkit to find cuda dependencies (#82695)"
This reverts commit fb9b96593c.

Reverted https://github.com/pytorch/pytorch/pull/82695 on behalf of https://github.com/malfet due to Break cublas packaging into wheel
2022-10-11 02:50:47 +00:00
Peter Bell
fb9b96593c Use FindCUDAToolkit to find cuda dependencies (#82695)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82695
Approved by: https://github.com/malfet
2022-10-06 15:43:39 +00:00
atalman
0e25a9490b Removing cublas static linking (#79280)
Removing cublas static linking

Test:  https://github.com/pytorch/pytorch/runs/6837323424?check_suite_focus=true

```
(base) atalman@atalman-dev-workstation-d4c889c8-2k8hl:~/whl_test/torch/lib$ ldd libtorch_cuda.so
	linux-vdso.so.1 (0x00007fffe8f6a000)
	libc10_cuda.so (0x00007f6539e6a000)
	libcudart-80664282.so.10.2 (0x00007f6539be9000)
	libnvToolsExt-3965bdd0.so.1 (0x00007f65399df000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f65397c0000)
	libc10.so (0x00007f653952f000)
	libtorch_cpu.so (0x00007f6520921000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f6520583000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f652037f000)
	libcublas.so.10 (0x00007f651c0c5000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f651bebd000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f651bb34000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f651b91c000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f651b52b000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f656aa13000)
	libgomp-a34b3233.so.1 (0x00007f651b301000)
	libcublasLt.so.10 (0x00007f651946c000)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79280
Approved by: https://github.com/seemethere
2022-06-13 13:10:16 +00:00
Nikita Shulga
80ea6955af Add cuda-11.3+clang9 build workflow (take 2)
To be able to detect unused captures in GPU code lambdas (as gcc does not support this diagnostic)

Remove unused opts lambda capture in `ProcessGroupMPI.cpp` and `Distributions.cu`

Fix sign-compare in nvfuser benchmark and ignore signed unsigned comparison in nvfuser tests
Fixes https://github.com/pytorch/pytorch/issues/75475 by aliasing CMAKE_CUDA_HOST_COMPILER to C_COMPILER when clang is used
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75293
Approved by: https://github.com/atalman, https://github.com/seemethere
2022-04-11 17:13:01 +00:00
PyTorch MergeBot
8fe43d76d5 Revert "Add cuda-11.3+clang9 build workflow"
This reverts commit 709fcc862e.

Reverted https://github.com/pytorch/pytorch/pull/75293 on behalf of https://github.com/janeyx99
2022-04-11 15:24:59 +00:00
Nikita Shulga
709fcc862e Add cuda-11.3+clang9 build workflow
To be able to detect unused captures in GPU code lambdas (as gcc does not support this diagnostic)

Remove unused opts lambda capture in `ProcessGroupMPI.cpp` and `Distributions.cu`

Fix sign-compare in nvfuser benchmark and ignore signed unsigned comparison in nvfuser tests
Fixes https://github.com/pytorch/pytorch/issues/75475 by aliasing CMAKE_CUDA_HOST_COMPILER to C_COMPILER when clang is used
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75293
Approved by: https://github.com/atalman, https://github.com/seemethere
2022-04-11 14:10:57 +00:00
Andrey Talman
197764b35d Remove cuda 11.1 references (#73514)
Summary:
Fixes : https://github.com/pytorch/pytorch/issues/73377

We've migrated to CUDA-11.3 as default toolkit in 1.9, it's time to stop builds (especially considering forward-compatibility guarantee across CUDA-11.x drivers)

Hence we are removing CUDA 11.1 support. We should also cleanup old cuda related code from our builder and pytorch repo making scripts a little more clean.

We have code that references cuda 9.2 , 10.1 , 11.0, 11.1, 11.2 and none of these are currently use

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73514

Reviewed By: janeyx99

Differential Revision: D34551989

Pulled By: atalman

fbshipit-source-id: 9ceaaa9b25ad49689986f4b29a26d20370d9d011
(cherry picked from commit fe109c62daf429e9053c03f6e374568ba23cd041)
2022-03-01 16:37:37 +00:00
Andrey Talman
1e7d20eaea Remove forcing CUDNN_STATIC when CAFFE2_STATIC_LINK_CUDA (#72290)
Summary:
Remove forcing CUDNN_STATIC when CAFFE2_STATIC_LINK_CUDA is set
Since we are transitioning to using dynamic loading for multiple pytorch dependecies  and CUDNN is the first step in this transition,  hence we want to remove forcing CUDNN to statically load, and instead load it dynamically.

Tested using following workflow:
https://github.com/pytorch/pytorch/actions/runs/1790666862

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72290

Reviewed By: albanD

Differential Revision: D34003793

Pulled By: atalman

fbshipit-source-id: 41bda7ac019a612ee53ceb18d1e372b1bb3cb68e
(cherry picked from commit 4a01940e68)
2022-02-04 14:35:53 +00:00
Nikita Shulga
c373387709 Update CMake and use native CUDA language support (#62445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62445

PyTorch currently uses the old style of compiling CUDA in CMake which is just a
bunch of scripts in `FindCUDA.cmake`. Newer versions support CUDA natively as
a language just like C++ or C.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D31503350

fbshipit-source-id: 2ee817edc9698531ae1b87eda3ad271ee459fd55
2021-10-11 09:05:48 -07:00
Jane Xu
9af6fe991c Remove CUDA 9.2 and older references from our cmake (#65065)
Summary:
Removes old CUDA references in our cuda.cmake

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65065

Reviewed By: malfet

Differential Revision: D30992673

Pulled By: janeyx99

fbshipit-source-id: 85b524089ed57e5acbc71720267cf05e24a8c20a
2021-09-16 12:54:49 -07:00
Luca Wehrstedt
c830db0265 Raise error in CMake for CUDA <9.2 (#61462)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61462

Anything before CUDA 9.2 is not supported (see https://github.com/pytorch/pytorch/pull/36848), and perhaps not even that.
ghstack-source-id: 133312018

Test Plan: CI

Reviewed By: samestep

Differential Revision: D29637251

fbshipit-source-id: 4300169b7298274b2074649342902a34bd2220b5
2021-07-09 11:28:38 -07:00
shmsong
ee2dd35ef4 Resolving native dependency and try_run for cross compile (#59764)
Summary:
This is a PR on build system that provides support for cross compiling on Jetson platforms.

The major change is:

1. Disable try runs for cross compiling in `COMPILER_WORKS`, `BLAS`, and `CUDA`. They will not be able to perform try run on a cross compile setup

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59764

Reviewed By: soulitzer

Differential Revision: D29524363

Pulled By: malfet

fbshipit-source-id: f06d1ad30b704c9a17d77db686c65c0754db07b8
2021-07-09 09:29:21 -07:00
Nikita Shulga
1ea5c19c19 Add USE_WHOLE_CUDNN option (#59744)
Summary:
It is only enabled if USE_STATIC_CUDNN is enabled

Next step after https://github.com/pytorch/pytorch/pull/59721 towards resolving fast kernels stripping reported in https://github.com/pytorch/pytorch/issues/50153

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59744

Reviewed By: seemethere, ngimel

Differential Revision: D29007314

Pulled By: malfet

fbshipit-source-id: 7091e299c0c6cc2a8aa82fbf49312cecf3bb861a
2021-06-09 21:12:42 -07:00
Nikita Shulga
8845cbabf0 [CMake] Split caffe2::cudnn into public and private (#59721)
Summary:
This is only important for builds where cuDNN is linked statically into libtorch_cpu.
Before this PR PyTorch wheels often accidentally contained several partial copies of cudnn_static library.
Splitting the interface into header only (cudnn-public) and library+headers(cudnn-private) prevents those from happening.
Preliminary step towards enabling optional linking whole cudnn_library to workaround issue reported in https://github.com/pytorch/pytorch/issues/50153

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59721

Reviewed By: ngimel

Differential Revision: D29000967

Pulled By: malfet

fbshipit-source-id: f054df92b265e9494076ab16c247427b39da9336
2021-06-09 13:18:48 -07:00
Nikita Shulga
2dda8d7571 Move cublas dependency after CuDNN (#58287)
Summary:
Library linking order matters during static linking
Not sure whether its a bug or a feature, but if cublas is reference
before CuDNN, it will be partially statically linked into the library,
even if it is not used

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58287

Reviewed By: janeyx99

Differential Revision: D28433165

Pulled By: malfet

fbshipit-source-id: 8dffa0533075126dc383428f838f7d048074205c
2021-05-24 09:39:09 -07:00
Nikita Shulga
133d8abbfc Compute nvrtc during libtorch build (#57579)
Summary:
The warning is completely harmless, but it still its nice not to emit it
when it could be computed.

Fixes https://github.com/pytorch/pytorch/issues/53350

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57579

Reviewed By: walterddr

Differential Revision: D28208938

Pulled By: malfet

fbshipit-source-id: 8dcc3f1bff7c5ed2c0157268c3063228d3c445b6
2021-05-04 22:51:24 -07:00
Nikita Shulga
08017f4598 Add explicit cudart_static dependency for cublas_static (#52509)
Summary:
Fixes following error during static linking, by enforcing that cudart dependency is put after cublasLt
```
/usr/bin/ld: /usr/local/cuda/lib64/libcublasLt_static.a(libcublasLt_static.a.o): undefined reference to symbol 'cudaStreamWaitEvent@libcudart.so.11.0'
/usr/local/cuda/lib64/libcudart.so: error adding symbols: DSO missing from command line
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52509

Reviewed By: janeyx99

Differential Revision: D26547622

Pulled By: malfet

fbshipit-source-id: 4e17f18cf0ab5479a549299faf2583a79fbda4b9
2021-02-19 10:45:49 -08:00