Commit Graph

122 Commits

Author SHA1 Message Date
PyTorch MergeBot
22cade56ba Revert "[Reland] Upgrade NVTX to NVTX3 (#97582)"
This reverts commit 5bbfb96203.

Reverted https://github.com/pytorch/pytorch/pull/97582 on behalf of https://github.com/izaitsevfb due to Breaks meta RL builds ([comment](https://github.com/pytorch/pytorch/pull/97582#issuecomment-1679568525))
2023-08-15 20:55:12 +00:00
cyy
5bbfb96203 [Reland] Upgrade NVTX to NVTX3 (#97582)
PR #90689 replaces NVTX with NVTX3. However, the torch::nvtoolsext is created only when the third party NVTX is used.
This is clearly a logical error. We now move the creation code out of the branch to cover all cases. This should fix the issues reported in the comments of #90689.

It would be better to move configurations of the failed FRL jobs to CI tests so that we can find such issues early before merging.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97582
Approved by: https://github.com/peterbell10
2023-08-14 16:55:25 +00:00
Jesse Cai
f81f9093ec [core][pruning][feature] cuSPARSELt build integration (#103700)
Summary:

This stack of PRs integrates cuSPARSELt into PyTorch.

This PR adds support for cuSPARSELt into the build process.
It adds in a new flag, USE_CUSPARSELT, that defaults to false.

When USE_CUSPARSELT=1 is specified, the user can also specify
CUSPARSELT_ROOT, which defines the path to the library.

Compiling PyTorch with cuSPARSELt support can be done as follows:

```
# Export both variables so that setup.py can see them.
export USE_CUSPARSELT=1
export CUSPARSELT_ROOT=/path/to/cusparselt

python setup.py develop
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103700
Approved by: https://github.com/albanD
2023-08-02 12:48:39 +00:00
Te
a73ad82c8f conditional CMAKE_CUDA_STANDARD (#104240)
Fixes #104237

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104240
Approved by: https://github.com/malfet
2023-06-27 18:41:25 +00:00
cyy
c8877e6080 enable some cuda warnings (#95568)
Currently some CUDA warnings are disabled because of old code quality issues that have since been fixed, so it is time to remove the suppression.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95568
Approved by: https://github.com/albanD
2023-04-28 02:39:17 +00:00
PyTorch MergeBot
5170995b2a Revert "Upgrade NVTX to NVTX3 (#90689)"
This reverts commit e64ddd1ab9.

Reverted https://github.com/pytorch/pytorch/pull/90689 on behalf of https://github.com/osalpekar due to Build Failures due to not being able to find one nvtx3 header in FRL jobs: [D42332540](https://www.internalfb.com/diff/D42332540)
2023-03-24 18:16:06 +00:00
cyy
e64ddd1ab9 Upgrade NVTX to NVTX3 (#90689)
Due to the recent upgrade to CUDA 11, we can upgrade NVTX to NVTX3 as well. NVTX3 is a header-only library, which can simplify the build system a lot.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90689
Approved by: https://github.com/soumith, https://github.com/malfet
2023-03-23 01:56:42 +00:00
Peter Bell
c5f6092591 Use FindCUDAToolkit to find cuda dependencies (#82695)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82695
Approved by: https://github.com/malfet
2023-03-01 17:26:36 +00:00
PyTorch MergeBot
801b3f8fc7 Revert "Use FindCUDAToolkit to find cuda dependencies (#82695)"
This reverts commit 7289d22d67.

Reverted https://github.com/pytorch/pytorch/pull/82695 on behalf of https://github.com/peterbell10 due to Breaks torchaudio build
2023-02-28 02:29:09 +00:00
Peter Bell
7289d22d67 Use FindCUDAToolkit to find cuda dependencies (#82695)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82695
Approved by: https://github.com/malfet
2023-02-21 22:35:17 +00:00
cyy
5fa7120722 Simplify CMake CUDNN code (#91676)
1. Move CUDNN code to a separate module.
2. Merge CUDNN public and private targets into a single private target. There is no need to expose the CUDNN dependency.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91676
Approved by: https://github.com/malfet
2023-02-08 01:06:10 +00:00
cyy
9291f9b9e2 Simplify cmake code (#91546)
We use various newer CMake features to simplify the build system:
1. Caffe2::threads is replaced by threads::threads.
2. Some unused MSVC flags are removed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91546
Approved by: https://github.com/malfet, https://github.com/Skylion007
2023-02-08 01:05:19 +00:00
cyy
afd7b581aa Simplify OpenMP detection in CMake (#91576)
We greatly simplify the handling of OpenMP in CMake by consistently using the caffe2::openmp target. We follow the old behavior by defaulting to the MKL OMP library and detecting OMP flags otherwise.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91576
Approved by: https://github.com/malfet
2023-02-04 11:50:06 +00:00
cyy
9710ac6531 Some CMake and CUDA cleanup given recent update to C++17 (#90599)
The main changes are:
1. Remove outdated checks for old compiler versions because they can't support C++17.
2. Remove outdated CMake checks because it now requires 3.18.
3. Remove outdated CUDA checks because we are moving to CUDA 11.

Almost all changes are in CMake files to make auditing easy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90599
Approved by: https://github.com/soumith
2022-12-30 11:19:26 +00:00
PyTorch MergeBot
deb414a43f Revert "Use FindCUDAToolkit to find cuda dependencies (#82695)"
This reverts commit fb9b96593c.

Reverted https://github.com/pytorch/pytorch/pull/82695 on behalf of https://github.com/malfet due to Break cublas packaging into wheel
2022-10-11 02:50:47 +00:00
Peter Bell
fb9b96593c Use FindCUDAToolkit to find cuda dependencies (#82695)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82695
Approved by: https://github.com/malfet
2022-10-06 15:43:39 +00:00
atalman
0e25a9490b Removing cublas static linking (#79280)
Removing cublas static linking

Test:  https://github.com/pytorch/pytorch/runs/6837323424?check_suite_focus=true

```
(base) atalman@atalman-dev-workstation-d4c889c8-2k8hl:~/whl_test/torch/lib$ ldd libtorch_cuda.so
	linux-vdso.so.1 (0x00007fffe8f6a000)
	libc10_cuda.so (0x00007f6539e6a000)
	libcudart-80664282.so.10.2 (0x00007f6539be9000)
	libnvToolsExt-3965bdd0.so.1 (0x00007f65399df000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f65397c0000)
	libc10.so (0x00007f653952f000)
	libtorch_cpu.so (0x00007f6520921000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f6520583000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f652037f000)
	libcublas.so.10 (0x00007f651c0c5000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f651bebd000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f651bb34000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f651b91c000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f651b52b000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f656aa13000)
	libgomp-a34b3233.so.1 (0x00007f651b301000)
	libcublasLt.so.10 (0x00007f651946c000)
```
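
The `ldd` listing above can also be checked mechanically. A minimal sketch, assuming a hypothetical helper named `has_dynamic_dep` (not part of PyTorch) that reads `ldd` output on stdin and succeeds when the named library appears as a dynamic dependency. It does not match names that `auditwheel` has rewritten with a hash, such as `libcudart-80664282.so.10.2`.

```shell
# Hypothetical helper: succeed if the library named in $1 (for example
# "libcublas") appears as a dynamic dependency in `ldd` output on stdin.
has_dynamic_dep() {
  awk -v name="$1" '
    {
      n = $1                 # first field is the library name or path
      sub(/^.*\//, "", n)    # drop any leading directory components
      # Require "<name>.so" at the very start, so that "libcublas"
      # does not also match "libcublasLt.so.10".
      if (index(n, name ".so") == 1) found = 1
    }
    END { exit !found }
  '
}
```

Possible usage: `ldd libtorch_cuda.so | has_dynamic_dep libcublas && echo "cublas is linked dynamically"`.
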
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79280
Approved by: https://github.com/seemethere
2022-06-13 13:10:16 +00:00
Nikita Shulga
80ea6955af Add cuda-11.3+clang9 build workflow (take 2)
To be able to detect unused captures in GPU code lambdas (as gcc does not support this diagnostic)

Remove unused opts lambda capture in `ProcessGroupMPI.cpp` and `Distributions.cu`

Fix sign-compare in nvfuser benchmark and ignore signed unsigned comparison in nvfuser tests
Fixes https://github.com/pytorch/pytorch/issues/75475 by aliasing CMAKE_CUDA_HOST_COMPILER to C_COMPILER when clang is used
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75293
Approved by: https://github.com/atalman, https://github.com/seemethere
2022-04-11 17:13:01 +00:00
PyTorch MergeBot
8fe43d76d5 Revert "Add cuda-11.3+clang9 build workflow"
This reverts commit 709fcc862e.

Reverted https://github.com/pytorch/pytorch/pull/75293 on behalf of https://github.com/janeyx99
2022-04-11 15:24:59 +00:00
Nikita Shulga
709fcc862e Add cuda-11.3+clang9 build workflow
To be able to detect unused captures in GPU code lambdas (as gcc does not support this diagnostic)

Remove unused opts lambda capture in `ProcessGroupMPI.cpp` and `Distributions.cu`

Fix sign-compare in nvfuser benchmark and ignore signed unsigned comparison in nvfuser tests
Fixes https://github.com/pytorch/pytorch/issues/75475 by aliasing CMAKE_CUDA_HOST_COMPILER to C_COMPILER when clang is used
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75293
Approved by: https://github.com/atalman, https://github.com/seemethere
2022-04-11 14:10:57 +00:00
Andrey Talman
197764b35d Remove cuda 11.1 references (#73514)
Summary:
Fixes : https://github.com/pytorch/pytorch/issues/73377

We migrated to CUDA 11.3 as the default toolkit in 1.9, so it's time to stop the older builds (especially considering the forward-compatibility guarantee across CUDA-11.x drivers).

Hence we are removing CUDA 11.1 support. We should also clean up old CUDA-related code in our builder and pytorch repos to make the scripts a little cleaner.

We have code that references CUDA 9.2, 10.1, 11.0, 11.1 and 11.2, and none of these are currently in use.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73514

Reviewed By: janeyx99

Differential Revision: D34551989

Pulled By: atalman

fbshipit-source-id: 9ceaaa9b25ad49689986f4b29a26d20370d9d011
(cherry picked from commit fe109c62daf429e9053c03f6e374568ba23cd041)
2022-03-01 16:37:37 +00:00
Andrey Talman
1e7d20eaea Remove forcing CUDNN_STATIC when CAFFE2_STATIC_LINK_CUDA (#72290)
Summary:
Remove forcing CUDNN_STATIC when CAFFE2_STATIC_LINK_CUDA is set.
We are transitioning to dynamic loading for multiple PyTorch dependencies, and CUDNN is the first step in this transition. Hence we want to stop forcing CUDNN to be linked statically and load it dynamically instead.

Tested using following workflow:
https://github.com/pytorch/pytorch/actions/runs/1790666862

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72290

Reviewed By: albanD

Differential Revision: D34003793

Pulled By: atalman

fbshipit-source-id: 41bda7ac019a612ee53ceb18d1e372b1bb3cb68e
(cherry picked from commit 4a01940e68)
2022-02-04 14:35:53 +00:00
Nikita Shulga
c373387709 Update CMake and use native CUDA language support (#62445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62445

PyTorch currently uses the old style of compiling CUDA in CMake which is just a
bunch of scripts in `FindCUDA.cmake`. Newer versions support CUDA natively as
a language just like C++ or C.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D31503350

fbshipit-source-id: 2ee817edc9698531ae1b87eda3ad271ee459fd55
2021-10-11 09:05:48 -07:00
Jane Xu
9af6fe991c Remove CUDA 9.2 and older references from our cmake (#65065)
Summary:
Removes old CUDA references in our cuda.cmake

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65065

Reviewed By: malfet

Differential Revision: D30992673

Pulled By: janeyx99

fbshipit-source-id: 85b524089ed57e5acbc71720267cf05e24a8c20a
2021-09-16 12:54:49 -07:00
Luca Wehrstedt
c830db0265 Raise error in CMake for CUDA <9.2 (#61462)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61462

Anything before CUDA 9.2 is not supported (see https://github.com/pytorch/pytorch/pull/36848), and perhaps not even that.
ghstack-source-id: 133312018

Test Plan: CI

Reviewed By: samestep

Differential Revision: D29637251

fbshipit-source-id: 4300169b7298274b2074649342902a34bd2220b5
2021-07-09 11:28:38 -07:00
shmsong
ee2dd35ef4 Resolving native dependency and try_run for cross compile (#59764)
Summary:
This is a PR on build system that provides support for cross compiling on Jetson platforms.

The major change is:

1. Disable try runs for cross compiling in `COMPILER_WORKS`, `BLAS`, and `CUDA`, since a try run cannot be performed on a cross-compile setup.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59764

Reviewed By: soulitzer

Differential Revision: D29524363

Pulled By: malfet

fbshipit-source-id: f06d1ad30b704c9a17d77db686c65c0754db07b8
2021-07-09 09:29:21 -07:00
Nikita Shulga
1ea5c19c19 Add USE_WHOLE_CUDNN option (#59744)
Summary:
It is only enabled if USE_STATIC_CUDNN is enabled

Next step after https://github.com/pytorch/pytorch/pull/59721 towards resolving fast kernels stripping reported in https://github.com/pytorch/pytorch/issues/50153

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59744

Reviewed By: seemethere, ngimel

Differential Revision: D29007314

Pulled By: malfet

fbshipit-source-id: 7091e299c0c6cc2a8aa82fbf49312cecf3bb861a
2021-06-09 21:12:42 -07:00
Nikita Shulga
8845cbabf0 [CMake] Split caffe2::cudnn into public and private (#59721)
Summary:
This is only important for builds where cuDNN is linked statically into libtorch_cpu.
Before this PR, PyTorch wheels often accidentally contained several partial copies of the cudnn_static library.
Splitting the interface into header-only (cudnn-public) and library+headers (cudnn-private) prevents that from happening.
This is a preliminary step towards optionally linking the whole cudnn library to work around the issue reported in https://github.com/pytorch/pytorch/issues/50153

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59721

Reviewed By: ngimel

Differential Revision: D29000967

Pulled By: malfet

fbshipit-source-id: f054df92b265e9494076ab16c247427b39da9336
2021-06-09 13:18:48 -07:00
Nikita Shulga
2dda8d7571 Move cublas dependency after CuDNN (#58287)
Summary:
Library linking order matters during static linking.
Not sure whether it's a bug or a feature, but if cublas is referenced
before CuDNN, it will be partially statically linked into the library,
even if it is not used.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58287

Reviewed By: janeyx99

Differential Revision: D28433165

Pulled By: malfet

fbshipit-source-id: 8dffa0533075126dc383428f838f7d048074205c
2021-05-24 09:39:09 -07:00
Nikita Shulga
133d8abbfc Compute nvrtc during libtorch build (#57579)
Summary:
The warning is completely harmless, but it is still nice not to emit it
when it could be computed.

Fixes https://github.com/pytorch/pytorch/issues/53350

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57579

Reviewed By: walterddr

Differential Revision: D28208938

Pulled By: malfet

fbshipit-source-id: 8dcc3f1bff7c5ed2c0157268c3063228d3c445b6
2021-05-04 22:51:24 -07:00
Nikita Shulga
08017f4598 Add explicit cudart_static dependency for cublas_static (#52509)
Summary:
Fixes the following error during static linking by enforcing that the cudart dependency is put after cublasLt:
```
/usr/bin/ld: /usr/local/cuda/lib64/libcublasLt_static.a(libcublasLt_static.a.o): undefined reference to symbol 'cudaStreamWaitEvent@libcudart.so.11.0'
/usr/local/cuda/lib64/libcudart.so: error adding symbols: DSO missing from command line
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52509

Reviewed By: janeyx99

Differential Revision: D26547622

Pulled By: malfet

fbshipit-source-id: 4e17f18cf0ab5479a549299faf2583a79fbda4b9
2021-02-19 10:45:49 -08:00
Nikita Shulga
de4c9ecc35 Fix libnvrtc discoverability in package patched by auditwheel (#52184)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52184

`auditwheel` inserts the first 8 characters of the sha256 checksum of the library into its file name before relocating it into the wheel package. This change adds logic for computing the same short sha sum and embedding it into LazyNVRTC as an alternative name for libnvrtc.so
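
A minimal sketch of the renaming scheme described above, assuming the hash is taken over the library file's contents and that `sha256sum` is available. The helper names `short_sha256` and `patched_name` are hypothetical; this illustrates the naming pattern only and is not the logic this change adds to the build.

```shell
# Hypothetical helper: first 8 hex characters of a file's SHA-256 digest.
short_sha256() {
  sha256sum "$1" | cut -c1-8
}

# Hypothetical helper: turn a path such as /some/dir/libfoo.so.1 into a
# patched file name of the form libfoo-<short sha>.so.1.
patched_name() {
  lib_path=$1
  base=$(basename "$lib_path")
  stem=${base%%.*}          # text before the first dot, e.g. "libfoo"
  suffix=${base#"$stem"}    # everything after the stem, e.g. ".so.1"
  printf '%s-%s%s\n' "$stem" "$(short_sha256 "$lib_path")" "$suffix"
}
```

Possible usage: `patched_name /usr/local/cuda/lib64/libnvrtc.so.11.0` prints the name under which the library would be looked up if it had been renamed this way.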

Fixes https://github.com/pytorch/pytorch/issues/52075

Test Plan: Imported from OSS

Reviewed By: seemethere

Differential Revision: D26417403

Pulled By: malfet

fbshipit-source-id: e366dd22e95e219979f6c2fa39acb11585b34c72
2021-02-13 19:38:27 -08:00
Nikita Shulga
bf841b25e4 [cmake] Add explicit cublas->cudart dependency (#52243)
Summary:
Necessary to ensure correct link order, especially if libraries are
linked statically. Otherwise, one might run into:
```
/usr/bin/ld: /usr/local/cuda/lib64/libcublasLt_static.a(libcublasLt_static.a.o): undefined reference to symbol 'cudaStreamWaitEvent@libcudart.so.11.0'
/usr/local/cuda/lib64/libcudart.so: error adding symbols: DSO missing from command line
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52243

Reviewed By: seemethere, ngimel

Differential Revision: D26437159

Pulled By: malfet

fbshipit-source-id: 33b8bb5040bda10537833f3ad737f535488452ea
2021-02-13 18:21:33 -08:00
peterjc123
bb99bea774 Compress NVCC flags for Windows (#45842)
Summary:
This makes the command line shorter.
Also updates `randomtemp`, whose previous version limited the argument length to 260.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45842

Reviewed By: albanD

Differential Revision: D24137088

Pulled By: ezyang

fbshipit-source-id: f0b4240735306e302eb3887f54a2b7af83c9f5dc
2020-10-07 08:39:15 -07:00
REX51
67889db8aa Replaced BLACKLIST with BLOCKLIST (#45781)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/41714

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45781

Reviewed By: nairbv

Differential Revision: D24136821

Pulled By: albanD

fbshipit-source-id: 0c0223bda0c5b4da75167a27d7859562db396304
2020-10-06 07:49:00 -07:00
Nikita Shulga
31ed468905 Fix cmake warning (#42707)
Summary:
If arguments in set_target_properties are not separated by whitespace, cmake raises a warning:
```
CMake Warning (dev) at cmake/public/cuda.cmake:269:
  Syntax Warning in cmake code at column 54

  Argument not separated from preceding token by whitespace.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42707

Reviewed By: ailzhang

Differential Revision: D22988055

Pulled By: malfet

fbshipit-source-id: c3744f23b383d603788cd36f89a8286a46b6c00f
2020-08-07 09:57:21 -07:00
Michael Carilli
0f358fab6b Hide cudnn symbols in libtorch_cuda.so when statically linking cudnn (#41986)
Summary:
This PR intends to fix https://github.com/pytorch/pytorch/issues/32983.

The initial (one-line) diff causes statically linked cudnn symbols in `libtorch_cuda.so` to have local linkage (such that they shouldn't be visible to external libraries during dynamic linking at load time), at least in my source build on Ubuntu 20.04.

Procedure I used to verify:
```
export USE_STATIC_CUDNN=ON
python3 setup.py install
...
```
then
```
mcarilli@mcarilli-desktop:~/Desktop/mcarilli_github/pytorch/torch/lib$ nm libtorch_cuda.so | grep cudnnCreate
00000000031ff540 t cudnnCreate
00000000031fbe70 t cudnnCreateActivationDescriptor
```
Before the diff they were marked with capital `T`s indicating external linkage.
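
The same check can be scripted. A minimal sketch, assuming a hypothetical helper named `exported_cudnn_symbols` (not part of the PR) that reads `nm` output on stdin and prints only the cudnn symbols still marked with a capital `T`:

```shell
# Hypothetical helper: print cudnn* symbols that `nm` reports as global
# text symbols (type "T"); local text symbols (type "t") are skipped.
exported_cudnn_symbols() {
  awk '$2 == "T" && $3 ~ /^cudnn/ { print $3 }'
}
```

Possible usage: `nm libtorch_cuda.so | exported_cudnn_symbols`. Empty output would suggest that the statically linked cudnn symbols are no longer externally visible.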

Caveats:
- The fix is gcc-specific afaik.  I have no idea how to enable it for Windows or other compilers.
- Hiding the cudnn symbols will break external C++ applications that rely on linking `libtorch.so` to supply cudnn symbol definitions.  IMO this is "off menu" usage so I don't think it's a major concern.  Hiding the symbols _won't_ break applications that call cudnn indirectly through torch functions, which IMO is the "on menu" way.
- I know _very little_ about the build system.  The diff's intent is to add a link option that applies to any Pytorch `.so`s that statically link cudnn, and does so on Linux only.  I'm blindly following soumith 's recommendation https://github.com/pytorch/pytorch/issues/32983#issuecomment-662056151, and post-checking the built libs (I also added `set(CMAKE_VERBOSE_MAKEFILE ON)` to the top-level CMakeLists.txt at one point to confirm `-Wl,--exclude-libs,libcudnn_static.a` was picked up by the command that linked `libtorch_cuda.so`).
- https://github.com/pytorch/pytorch/issues/32983 (which used a Pytorch 1.4 binary build) complained about `libtorch.so`, not `libtorch_cuda.so`:
    ```
    nvpohanh@ubuntu:~$ nm /usr/local/lib/python3.5/dist-packages/torch/lib/libtorch.so | grep ' cudnnCreate'
    000000000f479c30 T cudnnCreate
    000000000f475ff0 T cudnnCreateActivationDescriptor
    ```
  In my source build, `libtorch.so` ends up small, containing no cudnn symbols (this is true with or without the PR's diff), which contradicts https://github.com/pytorch/pytorch/issues/32983.  Maybe the symbol organization (what goes in   `libtorch.so` vs `libtorch_cuda/cpu/whatever.so`) changed since 1.4.  Or maybe the symbol organization is different for source vs binary builds, in which case I have no idea if this PR's diff has the same effect for a binary build.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41986

Reviewed By: glaringlee

Differential Revision: D22934926

Pulled By: malfet

fbshipit-source-id: 711475834e0f8148f0e5f2fe28fca5f138ef494b
2020-08-04 22:59:40 -07:00
Alexander Grund
a4b831a86a Replace if(NOT ${var}) by if(NOT var) (#41924)
Summary:
As explained in https://github.com/pytorch/pytorch/issues/41922, using `if(NOT ${var})` is usually wrong and can lead to issues like https://github.com/pytorch/pytorch/issues/41922, where the condition is wrongly evaluated to FALSE instead of TRUE. Instead, the unevaluated variable name should be used in all cases; see the CMake documentation for details.

This fixes the `NOT ${var}` cases by using a simple regexp replacement. It seems `pybind11_PREFER_third_party` is the only variable really prone to causing an issue, as all others are set. However, because CMake evaluates unquoted strings in `if` conditions as variable names, I recommend never using an unquoted `${var}` in an `if` condition. A similar regexp-based replacement could be done on the whole codebase, but as that makes a lot of changes I didn't include it now. Also, `if(${var})` will likely lead to a parser error if `var` is unset, instead of a wrong result.

Fixes https://github.com/pytorch/pytorch/issues/41922

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41924

Reviewed By: seemethere

Differential Revision: D22700229

Pulled By: mrshenli

fbshipit-source-id: e2b3466039e4312887543c2e988270547a91c439
2020-07-23 15:49:20 -07:00
peter
6e4f99b063 Fix wrong MSVC version constraint for CUDA 9.2 (#40794)
Summary:
Tested with https://github.com/pytorch/pytorch/pull/40782.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40794

Differential Revision: D22318045

Pulled By: malfet

fbshipit-source-id: a737ffd7cb8a6a9efb62b84378318f4c3800ad8f
2020-06-30 13:02:45 -07:00
peter
905c6730b7 Adding /FS for NVCC if /Zi is used (#39994)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39989.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39994

Differential Revision: D22034956

Pulled By: malfet

fbshipit-source-id: b26cf188eba8b796ee6e39e6adbc3e2fbb07a53a
2020-06-13 12:16:12 -07:00
Xiang Gao
b3fac8af6b Initial support for building on Ampere GPU, CUDA 11, cuDNN 8 (#39277)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39277

This PR contains initial changes that makes PyTorch build with Ampere GPU, CUDA 11, and cuDNN 8.
TF32 related features will not be included in this PR.

Test Plan: Imported from OSS

Differential Revision: D21832814

Pulled By: malfet

fbshipit-source-id: 37f9c6827e0c26ae3e303580f666584230832d06
2020-06-02 10:03:42 -07:00
peter
bf53784e3c Treat cross-execution-space-call as errors for NVCC on Windows (#37302)
Summary:
On Windows, when you call unsupported functions like `std::pow`, `std::isnan` or `std::isinf` in a device function and compile, a warning is emitted:
```
kernel.cu
kernel.cu(39): warning: calling a __host__ function from a __host__ __device__ function is not allowed

kernel.cu(42): warning: calling a __host__ function from a __host__ __device__ function is not allowed

kernel.cu(39): warning: calling a __host__ function("isnan<double> ") from a __host__ __device__ function("test_") is not allowed

kernel.cu(42): warning: calling a __host__ function("isinf<double> ") from a __host__ __device__ function("test_") is not allowed
```
However, those calls will lead to runtime errors, see https://github.com/pytorch/pytorch/pull/36749#issuecomment-619239788 and https://github.com/pytorch/pytorch/issues/31108. So we should treat them as errors.
Previously, the situation was worse because the warnings were turned off by passing in `-w`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37302

Differential Revision: D21297207

Pulled By: ngimel

fbshipit-source-id: 822b8a98c10e54c38319674763b6681db21c1021
2020-04-29 01:52:52 -07:00
peter
c5d6f59ab1 Replacing EHa with EHsc (#37235)
Summary:
We should not rely on async exceptions. Catching only C++ exceptions is more sensible and may give a boost in both space (1163 MB -> 1073 MB, 0.92x) and performance (51m -> 49m, 0.96x).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37235

Differential Revision: D21256918

Pulled By: ezyang

fbshipit-source-id: 572ee96f2e4c48ad13f83409e4e113483b3a457a
2020-04-28 08:20:37 -07:00
Jacob Zhong
e33c3e49d5 Fix hard-code cmake target (#37310)
Summary:
Fix https://github.com/pytorch/pytorch/issues/33928. Basically just move the dependency into a new imported target.

I'm not sure whether this modification will affect other parts, so please test it thoroughly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37310

Differential Revision: D21263066

Pulled By: ezyang

fbshipit-source-id: 7dc38f578d7e9bcb491ef5e122106fb66a33156f
2020-04-27 14:20:30 -07:00
Nikita Shulga
26ee0eee10 Use cufft_static_nocallback (#35813)
Summary:
Hat tip to ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35813

Test Plan: CI

Differential Revision: D20800789

Pulled By: malfet

fbshipit-source-id: a51cedfc7dfc68ac59d4f00f12eaff43cf1fdd7a
2020-04-01 13:43:49 -07:00
peter
45c9ed825a Formatting cmake (to lowercase without space for if/elseif/else/endif) (#35521)
Summary:
Running commands:
```bash
shopt -s globstar

sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i caffe2/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i torch/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i c10/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i cmake/**/*.cmake
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i cmake/**/*.cmake.in
```
We may further convert all the commands into lowercase according to the following issue: 77543bde41.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35521

Differential Revision: D20704382

Pulled By: malfet

fbshipit-source-id: 42186b9b1660c34428ab7ceb8d3f7a0ced5d2e80
2020-03-27 14:25:17 -07:00
peter
f5383a213f Fix openmp detection with clang-cl (#35365)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35365

Differential Revision: D20653049

Pulled By: ezyang

fbshipit-source-id: 193c0d956b1aea72b3daa104ef49c4bf167a165a
2020-03-26 19:59:53 -07:00
Eli Uriegas
765c5b1c95 .circleci: Add CUDA 10.2 to CI (#34241)
Summary:
Basically a re-do of https://github.com/pytorch/pytorch/pull/33471

Should be safe to merge now that https://github.com/pytorch/pytorch/issues/34135 has been merged.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34241

Differential Revision: D20292711

Pulled By: seemethere

fbshipit-source-id: c508b5ef58f52aa3a263fd33b0373f31719fa0a4
2020-03-05 15:06:34 -08:00
Shen Li
b1fd7ba019 Revert D20169501: [pytorch][PR] .circleci: Add CUDA 10.2 to our CI pipeline
Test Plan: revert-hammer

Differential Revision:
D20169501

Original commit changeset: 43b7ca680200

fbshipit-source-id: dbeb0315ccc06b8e082d019cd1ffcd97e1d38e04
2020-03-03 08:15:36 -08:00
Eli Uriegas
bb4465f9f5 .circleci: Add CUDA 10.2 to our CI pipeline (#33471)
Summary:
Adds support for CUDA 10.2 builds on our nightly pipelines / regular test pipelines.

Depends on https://github.com/pytorch/builder/pull/404
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33471

Test Plan: sandcastle_will_deliver

Reviewed By: ezyang

Differential Revision: D20169501

Pulled By: seemethere

fbshipit-source-id: 43b7ca680200a67fa88ad4f7b5a121954c9f089d
2020-03-02 15:50:48 -08:00