Summary:
*Context:* https://github.com/pytorch/pytorch/issues/53406 added a lint for trailing whitespace at the ends of lines. However, in order to pass FB-internal lints, that PR also had to normalize the trailing newlines in four of the files it touched. This PR adds an OSS lint to normalize trailing newlines.
The changes to the following files (made in 54847d0adb9be71be4979cead3d9d4c02160e4cd) are the only manually-written parts of this PR:
- `.github/workflows/lint.yml`
- `mypy-strict.ini`
- `tools/README.md`
- `tools/test/test_trailing_newlines.py`
- `tools/trailing_newlines.py`
I would have liked to make this just a shell one-liner like the other three similar lints, but nothing I could find quite fit the bill. Specifically, all the answers I tried from the following Stack Overflow questions were far too slow (at least a minute and a half to run on this entire repository):
- [How to detect file ends in newline?](https://stackoverflow.com/q/38746)
- [How do I find files that do not end with a newline/linefeed?](https://stackoverflow.com/q/4631068)
- [How to list all files in the Git index without newline at end of file](https://stackoverflow.com/q/27624800)
- [Linux - check if there is an empty line at the end of a file [duplicate]](https://stackoverflow.com/q/34943632)
- [git ensure newline at end of each file](https://stackoverflow.com/q/57770972)
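For illustration only, the kind of per-file check such a lint performs can be sketched in Python. This is not the contents of `tools/trailing_newlines.py`; the function names and the exact rule (an empty file passes, a non-empty file must end in exactly one `\n`) are assumptions made for this sketch.

```python
import os


def has_correct_trailing_newline(tail: bytes) -> bool:
    """Hypothetical rule: empty data is fine; otherwise the data must
    end with exactly one newline character."""
    if not tail:
        return True
    return tail.endswith(b"\n") and not tail.endswith(b"\n\n")


def file_has_correct_trailing_newline(path: str) -> bool:
    """Apply the rule to a file by reading only its last two bytes,
    which is all the rule above needs to decide."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(size - 2, 0))
        return has_correct_trailing_newline(f.read())
```

Reading only the tail of each file, rather than its whole contents, is one plausible way to keep a check like this fast across a large repository.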
To avoid giving false positives during the few days after this PR is merged, we should probably only merge it after https://github.com/pytorch/pytorch/issues/54967.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54737
Test Plan:
Running the shell script from the "Ensure correct trailing newlines" step in the `quick-checks` job of `.github/workflows/lint.yml` should print no output and exit in a fraction of a second with a status of 0. That was not the case prior to this PR, as shown by this failing GHA workflow run on an earlier draft of this PR:
- https://github.com/pytorch/pytorch/runs/2197446987?check_suite_focus=true
In contrast, this run (after correcting the trailing newlines in this PR) succeeded:
- https://github.com/pytorch/pytorch/pull/54737/checks?check_run_id=2197553241
To unit-test `tools/trailing_newlines.py` itself (this is run as part of our "Test tools" GitHub Actions workflow):
```
python tools/test/test_trailing_newlines.py
```
Reviewed By: malfet
Differential Revision: D27409736
Pulled By: samestep
fbshipit-source-id: 46f565227046b39f68349bbd5633105b2d2e9b19
Summary:
Draft: enable fast_nvcc.
* cleaned up some non-standard usages
* added fall-back to wrap_nvcc
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49773
Test Plan:
Configuration to enable fast nvcc:
- install and enable `ccache` but delete `.ccache/` folder before each build.
- `TORCH_CUDA_ARCH_LIST=6.0;6.1;6.2;7.0;7.5`
- Toggling `USE_FAST_NVCC=ON/OFF` cmake config and run `cmake --build` to verify the build time.
Initial statistic for a full compilation:
* `cmake --build . -- -j $(nproc)`:
- fast NVCC
```
real 48m55.706s
user 1559m14.218s
sys 318m41.138s
```
- normal NVCC:
```
real 43m38.723s
user 1470m28.131s
sys 90m46.879s
```
* `cmake --build . -- -j $(nproc/4)`:
- fast NVCC:
```
real 53m44.173s
user 1130m18.323s
sys 71m32.385s
```
- normal NVCC:
```
real 81m53.768s
user 858m45.402s
sys 61m15.539s
```
* Conclusion: fast NVCC doesn't provide much gain when the build is already set to use all CPU cores; in fact it is **even worse** because of the thread switching.
Initial statistic for partial recompile (edit .cu files):
* `cmake --build . -- -j $(nproc)`
- fast NVCC:
```
[2021-01-13 18:10:24] [ 86%] Building NVCC (Device) object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_BinaryMiscOpsKernels.cu.o
[2021-01-13 18:11:08] [ 86%] Linking CXX shared library ../lib/libtorch_cuda.so
```
- normal NVCC:
```
[2021-01-13 17:35:40] [ 86%] Building NVCC (Device) object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_BinaryMiscOpsKernels.cu.o
[2021-01-13 17:38:08] [ 86%] Linking CXX shared library ../lib/libtorch_cuda.so
```
* Conclusion: Effective compilation time for a single `.cu` file modification was reduced from 2min30sec to only 40sec when compiling for multiple architectures. This shows a **4X** speedup using fast NVCC, approaching the theoretical limit of 5X when compiling 5 gencode architectures at the same time.
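As a rough cross-check, the intervals between the two log lines in each listing above can be computed directly from their timestamps. This is only a sketch: it assumes the gap between the "Building NVCC" line and the "Linking" line approximates the compile time of the edited file, and the helper name is made up for this example.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two timestamps in the log format shown above."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Timestamps copied from the fast NVCC and normal NVCC logs above.
fast = elapsed_seconds("2021-01-13 18:10:24", "2021-01-13 18:11:08")
normal = elapsed_seconds("2021-01-13 17:35:40", "2021-01-13 17:38:08")
speedup = normal / fast

print(fast, normal, round(speedup, 2))  # 44.0 148.0 3.36
```

The exact timestamps give about 3.4X; the rounded figures quoted in the conclusion (150 s vs. 40 s) give 3.75X. Both are in the same ballpark as the 4X quoted above and below the 5X ceiling for five architectures.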
Follow up PRs:
- should have a better fallback mechanism that detects whether a build is supported by fast_nvcc up front, instead of dry-running and then falling back on failure.
- performance measurement instrumentation to measure what's the total compile time vs the parallel tasks critical path time.
- figure out why `-j $(nproc)` gives significant sys overhead (`sys 318m41.138s` vs `sys 90m46.879s`) over normal nvcc, guess this is context switching, but not exactly sure
Reviewed By: malfet
Differential Revision: D25692758
Pulled By: walterddr
fbshipit-source-id: c244d07b9b71f146e972b6b3682ca792b38c4457
Summary:
Check return code of `nvcc --version` and if it's not zero, print warning and mark CUDA as not found.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44236
Test Plan: Run `CUDA_NVCC_EXECUTABLE=/foo/bar cmake ../`
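The actual change lives in CMake, but the logic of the check can be sketched in Python for readers unfamiliar with it. The function name and the warning text are hypothetical; only the idea (run `nvcc --version`, treat a non-zero exit status or a failure to launch as "CUDA not found") comes from the summary above.

```python
import subprocess


def nvcc_usable(nvcc_path: str) -> bool:
    """Return True only if `<nvcc_path> --version` runs and exits with 0."""
    try:
        result = subprocess.run(
            [nvcc_path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except OSError:
        # The executable is missing or cannot be launched at all.
        print(f"warning: could not run {nvcc_path}; treating CUDA as not found")
        return False
    if result.returncode != 0:
        print(f"warning: {nvcc_path} --version failed; treating CUDA as not found")
        return False
    return True
```

With the path from the test plan, `nvcc_usable("/foo/bar")` returns `False` on a machine where that path does not exist.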
Reviewed By: ezyang
Differential Revision: D23552336
Pulled By: malfet
fbshipit-source-id: cf9387140a8cdbc8dab12fcc4bfaf55ae8e6a502
Summary:
As explained in https://github.com/pytorch/pytorch/issues/41922, using `if(NOT ${var})` is usually wrong and can lead to issues like https://github.com/pytorch/pytorch/issues/41922, where the condition is wrongly evaluated to FALSE instead of TRUE. Instead, the unevaluated variable name should be used in all cases; see the CMake documentation for details.
This fixes the `NOT ${var}` cases by using a simple regexp replacement. It seems `pybind11_PREFER_third_party` is the only variable really prone to causing an issue, as all others are set. However, because CMake evaluates unquoted strings in `if` conditions as variable names, I recommend never using an unquoted `${var}` in an `if` condition. A similar regexp-based replacement could be done on the whole codebase, but as that makes a lot of changes I didn't include it now. Also, `if(${var})` will likely lead to a parser error if `var` is unset, instead of a wrong result.
Fixes https://github.com/pytorch/pytorch/issues/41922
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41924
Reviewed By: seemethere
Differential Revision: D22700229
Pulled By: mrshenli
fbshipit-source-id: e2b3466039e4312887543c2e988270547a91c439
Summary:
This pulls the following merge requests from CMake upstream:
- https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4979
- https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4991
The above two merge requests improve the Ampere build:
- If `TORCH_CUDA_ARCH_LIST` is not set, it can now automatically pick up 8.0 as part of its default value
- If `TORCH_CUDA_ARCH_LIST=Ampere`, it no longer fails with `Unknown CUDA Architecture Name Ampere in CUDA_SELECT_NVCC_ARCH_FLAGS`
Code related to architectures < 3.5 was manually removed because PyTorch no longer supports them.
cc: ngimel ptrblck
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41133
Reviewed By: malfet
Differential Revision: D22540547
Pulled By: ezyang
fbshipit-source-id: 6e040f4054ef04f18ebb7513497905886a375632
Summary:
Fixes https://github.com/pytorch/pytorch/issues/26304
After this patch `build.ninja` entries for `.cu` files will contain a `depfile` variable pointing to a `.NVCC-depend` file containing dependencies (i.e., header files included directly or indirectly) of the `.cu` source file. Until now, those `.NVCC-depend` files were being transposed into `.cu.o.depend` files in CMake format. That did not work as intended because the `.cu.o` target file was declared to be dependent on the `.cu.o.depend` file itself, rather than its contents. In fact, Ninja lacks the functionality to process dependencies in the CMake format of those `.cu.o.depend` files.
This was tested on Linux as described in https://github.com/pytorch/pytorch/issues/26304#issuecomment-614667170
I have also verified that the original problem does not reproduce with Makefiles (i.e., when `ninja` is not present in the system) and that PyTorch still builds successfully with Makefiles after this patch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36938
Differential Revision: D21156042
Pulled By: ezyang
fbshipit-source-id: fda3aaa57207f4d6bf74d2f254fe45fb7fd90eec
Summary:
With the Fedora negativo17 repo, the cuDNN headers are installed in the /usr/include/cuda directory, alongside the other CUDA libraries.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31755
Differential Revision: D19697262
Pulled By: ezyang
fbshipit-source-id: be80d3467ffb90fd677d551f4403aea65a2ef5b3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30530
Switch some mentions of "C++11" in the docs to "C++14"
ghstack-source-id: 95812049
Test Plan: testinprod
Differential Revision: D18733733
fbshipit-source-id: b9d0490eb3f72bad974d134bbe9eb563f6bc8775
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24331
Currently our logs are something like 40M a pop. Turning off warnings and turning on verbose makefiles (to see the compile commands) reduces this to more like 8M. We could probably reduce log size more but verbose makefile is really useful and we'll keep it turned on for Windows.
Some findings:
1. Setting `CMAKE_VERBOSE_MAKEFILE` inside CMakelists.txt itself as suggested in https://github.com/ninja-build/ninja/issues/900#issuecomment-417917630 does not work on Windows. Setting `-DCMAKE_VERBOSE_MAKEFILE=1` does work (and we respect this environment variable.)
2. The high (`/W3`) warning level is on by default on MSVC because cmake inserts it into the default flags. On recent versions of cmake, CMP0092 can be used to remove this flag from the default set. The string replace trick sort of works, but the standard snippet you'll find on the internet won't remove the flag from nvcc. I inspected the CUDA cmake code and verified it does respect CMP0092.
3. `EHsc` is also in the default flags; this one cannot be suppressed via a policy. The string replace trick seems to work...
4. ... however, it seems nvcc implicitly inserts an `/EHs` after `-Xcompiler` specified flags, which means that if we add `/EHa` to our set of flags, you'll get a warning from nvcc. So we probably have to figure out how to exclude EHa from the nvcc flags set (EHs does seem to work fine.)
5. To suppress warnings in nvcc, you must BOTH pass `-w` and `-Xcompiler /w`. Individually these are not enough.
The patch applies these things; it also fixes a bug where nvcc verbose command printing doesn't work with `-GNinja`.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D17131746
Pulled By: ezyang
fbshipit-source-id: fb142f8677072a5430664b28155373088f074c4b
Summary:
Currently they sit together with other code in cuda.cmake. This commit is the first step toward cleaning up cuDNN detection in our build system.
Another attempt at https://github.com/pytorch/pytorch/issues/24293, which broke the manywheels build because it did not handle `USE_STATIC_CUDNN` properly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24938
Differential Revision: D17070920
Pulled By: ezyang
fbshipit-source-id: a4d017a3505c102d9c435a73ae62332e4336c52e
Summary:
Currently they sit together with other code in cuda.cmake. This commit
is the first step toward cleaning up cuDNN detection in our build system.
Another attempt at https://github.com/pytorch/pytorch/issues/24293, which broke the manywheels build because it did not handle `USE_STATIC_CUDNN`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24784
Differential Revision: D16914345
Pulled By: ezyang
fbshipit-source-id: fd261478c01d879dc770c1f1a56b17cc1a587be2
Summary:
```
[1/1424] Building NVCC (Device) object caffe2/CMakeFiles/torch.dir/operators/torch_generated_weighted_sample_op.cu.obj
CMake Warning (dev) at torch_generated_weighted_sample_op.cu.obj.Release.cmake:82 (set):
Syntax error in cmake code at
C:/Users/Ganzorig/pytorch/build/caffe2/CMakeFiles/torch.dir/operators/torch_generated_weighted_sample_op.cu.obj.Release.cmake:82
when parsing string
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include;C:/Users/Ganzorig/pytorch/aten/src;C:/Users/Ganzorig/pytorch/build;C:/Users/Ganzorig/pytorch;C:/Users/Ganzorig/pytorch/cmake/../third_party/googletest/googlemock/include;C:/Users/Ganzorig/pytorch/cmake/../third_party/googletest/googletest/include;;C:/Users/Ganzorig/pytorch/third_party/protobuf/src;C:/Users/Ganzorig/pytorch/cmake/../third_party/benchmark/include;C:/Users/Ganzorig/pytorch/cmake/../third_party/eigen;C:/Users/Ganzorig/Anaconda3/envs/code/include;C:/Users/Ganzorig/Anaconda3/envs/code/lib/site-packages/numpy/core/include;C:/Users/Ganzorig/pytorch/cmake/../third_party/pybind11/include;C:/Users/Ganzorig/pytorch/cmake/../third_party/cub;C:/Users/Ganzorig/pytorch/build/caffe2/contrib/aten;C:/Users/Ganzorig/pytorch/third_party/onnx;C:/Users/Ganzorig/pytorch/build/third_party/onnx;C:/Users/Ganzorig/pytorch/third_party/foxi;C:/Users/Ganzorig/pytorch/build/third_party/foxi;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include;C:/Users/Ganzorig/pytorch/caffe2/../torch/csrc/api;C:/Users/Ganzorig/pytorch/caffe2/../torch/csrc/api/include;C:/Program Files/NVIDIA 
Corporation/NvToolsExt/include;C:/Users/Ganzorig/pytorch/caffe2/aten/src/TH;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/TH;C:/Users/Ganzorig/pytorch/caffe2/../torch/../aten/src;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src;C:/Users/Ganzorig/pytorch/build/aten/src;C:/Users/Ganzorig/pytorch/caffe2/../torch/../aten/src;C:/Users/Ganzorig/pytorch/build/caffe2/../aten/src;C:/Users/Ganzorig/pytorch/build/caffe2/../aten/src/ATen;C:/Users/Ganzorig/pytorch/build/aten/src;C:/Users/Ganzorig/pytorch/caffe2/../torch/csrc;C:/Users/Ganzorig/pytorch/caffe2/../torch/../third_party/miniz-2.0.8;C:/Users/Ganzorig/pytorch/caffe2/../torch/csrc/api;C:/Users/Ganzorig/pytorch/caffe2/../torch/csrc/api/include;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/TH;C:/Users/Ganzorig/pytorch/aten/src/TH;C:/Users/Ganzorig/pytorch/aten/src;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src;C:/Users/Ganzorig/pytorch/build/aten/src;C:/Users/Ganzorig/pytorch/aten/src;C:/Users/Ganzorig/pytorch/aten/../third_party/catch/single_include;C:/Users/Ganzorig/pytorch/aten/src/ATen/..;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/ATen;C:/Users/Ganzorig/pytorch/third_party/miniz-2.0.8;C:/Users/Ganzorig/pytorch/caffe2/core/nomnigraph/include;C:/Users/Ganzorig/pytorch/caffe2/;C:/Program Files/NVIDIA GPU Computing 
Toolkit/CUDA/v10.1/include;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/TH;C:/Users/Ganzorig/pytorch/aten/src/TH;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/THC;C:/Users/Ganzorig/pytorch/aten/src/THC;C:/Users/Ganzorig/pytorch/aten/src/THCUNN;C:/Users/Ganzorig/pytorch/aten/src/ATen/cuda;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/TH;C:/Users/Ganzorig/pytorch/aten/src/TH;C:/Users/Ganzorig/pytorch/aten/src;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src;C:/Users/Ganzorig/pytorch/build/aten/src;C:/Users/Ganzorig/pytorch/aten/src;C:/Users/Ganzorig/pytorch/aten/../third_party/catch/single_include;C:/Users/Ganzorig/pytorch/aten/src/ATen/..;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/ATen;C:/Users/Ganzorig/pytorch/third_party/protobuf/src;C:/Users/Ganzorig/pytorch/c10/../;C:/Users/Ganzorig/pytorch/build;C:/Users/Ganzorig/pytorch/third_party/cpuinfo/include;C:/Users/Ganzorig/pytorch/third_party/FP16/include;C:/Users/Ganzorig/pytorch/third_party/foxi;C:/Users/Ganzorig/pytorch/third_party/foxi;C:/Users/Ganzorig/pytorch/third_party/onnx;C:/Users/Ganzorig/pytorch/build/third_party/onnx;C:/Users/Ganzorig/pytorch/build/third_party/onnx;C:/Users/Ganzorig/pytorch/c10/cuda/../..;C:/Users/Ganzorig/pytorch/build;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1\include;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include
Invalid escape sequence \i
Policy CMP0010 is not set: Bad variable reference syntax is an error. Run
"cmake --help-policy CMP0010" for policy details. Use the cmake_policy
command to set the policy and suppress this warning.
This warning is for project developers. Use -Wno-dev to suppress it.
```
Compared to https://github.com/pytorch/pytorch/issues/24044 , this commit moves the fix up, and uses [bracket arguments](https://cmake.org/cmake/help/v3.12/manual/cmake-language.7.html#bracket-argument).
PR also sent to upstream: https://gitlab.kitware.com/cmake/cmake/merge_requests/3679
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24420
Differential Revision: D16914193
Pulled By: ezyang
fbshipit-source-id: 9f897cf4f607502a16dbd1045f2aedcb49c38da7
Summary:
This is a follow-up to gh-23408. No longer supported are any arches < 3.5 (numbers + 'Fermi' and 'Kepler+Tegra').
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24442
Differential Revision: D16889283
Pulled By: ezyang
fbshipit-source-id: 3c0c35d51b7ac7642d1be7ab4b0f260ac93b60c9
Summary:
The old behavior was to always use `sm_30`. The new behavior is:
- For building via a setup.py, check if `'arch'` is in `extra_compile_args`. If so, don't change anything.
- If `TORCH_CUDA_ARCH_LIST` is set, respect that (can be 1 or more arches)
- Otherwise, query device capability and use that.
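The precedence described in the list above can be sketched as a small pure function. This is an illustration, not the code in `torch.utils.cpp_extension`: the function name, its parameters, and the exact `-gencode` flag strings it returns are assumptions chosen for the example.

```python
from typing import List, Optional, Tuple


def select_arch_flags(
    extra_compile_args: List[str],
    arch_list_env: Optional[str],
    device_capability: Tuple[int, int],
) -> List[str]:
    """Hypothetical helper mirroring the three rules listed above."""
    # Rule 1: the user already passed an arch flag, so change nothing.
    if any("arch" in flag for flag in extra_compile_args):
        return []

    # Rule 2: respect TORCH_CUDA_ARCH_LIST if it is set and non-empty.
    if arch_list_env:
        entries = arch_list_env.replace(";", " ").split()
    else:
        # Rule 3: fall back to the capability of the local device.
        major, minor = device_capability
        entries = [f"{major}.{minor}"]

    flags = []
    for entry in entries:
        wants_ptx = entry.endswith("+PTX")
        version = entry[: -len("+PTX")] if wants_ptx else entry
        num = version.replace(".", "")
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if wants_ptx:
            # Also embed PTX so newer devices can JIT-compile the kernel.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags
```

For example, `select_arch_flags([], None, (6, 1))` yields a single `sm_61` flag, while passing `"6.0 7.0+PTX"` as the second argument yields flags for `sm_60`, `sm_70`, and PTX for `compute_70`.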
To test this, for example on a machine with `torch` installed for py37:
```
$ git clone https://github.com/pytorch/extension-cpp.git
$ cd extension-cpp/cuda
$ python setup.py install
$ cuobjdump --list-elf build/lib.linux-x86_64-3.7/lltm_cuda.cpython-37m-x86_64-linux-gnu.so
ELF file 1: lltm.1.sm_61.cubin
```
Existing tests in `test_cpp_extension.py` for `load_inline` and for compiling via `setup.py` in test/cpp_extensions/ cover this.
Closes gh-18657
EDIT: some more tests:
```
from torch.utils.cpp_extension import load
lltm = load(name='lltm', sources=['lltm_cuda.cpp', 'lltm_cuda_kernel.cu'])
```
```
# with TORCH_CUDA_ARCH_LIST undefined or an empty string
$ cuobjdump --list-elf /tmp/torch_extensions/lltm/lltm.so
ELF file 1: lltm.1.sm_61.cubin
# with TORCH_CUDA_ARCH_LIST = "3.5 5.2 6.0 6.1 7.0+PTX"
$ cuobjdump --list-elf build/lib.linux-x86_64-3.7/lltm_cuda.cpython-37m-x86_64-linux-gnu.so
ELF file 1: lltm_cuda.cpython-37m-x86_64-linux-gnu.1.sm_35.cubin
ELF file 2: lltm_cuda.cpython-37m-x86_64-linux-gnu.2.sm_52.cubin
ELF file 3: lltm_cuda.cpython-37m-x86_64-linux-gnu.3.sm_60.cubin
ELF file 4: lltm_cuda.cpython-37m-x86_64-linux-gnu.4.sm_61.cubin
ELF file 5: lltm_cuda.cpython-37m-x86_64-linux-gnu.5.sm_70.cubin
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23408
Differential Revision: D16784110
Pulled By: soumith
fbshipit-source-id: 69ba09e235e4f906b959fd20322c69303240ee7e
Summary:
Which was added in https://github.com/pytorch/pytorch/issues/16412.
Also make some CUDNN_* CMake variables build options, so that CMake scripts no longer read them directly from the environment via `$ENV`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24044
Differential Revision: D16783426
Pulled By: ezyang
fbshipit-source-id: cb196b0013418d172d0d36558995a437bd4a3986
Summary:
Fixes #11518
Upstream PR submitted at https://gitlab.kitware.com/cmake/cmake/merge_requests/2400
On some embedded platforms, the NVIDIA driver logs unexpected verbose output to stdout.
One example is Drive PX2, where we see something like this whenever a CUDA program is run:
```
nvrm_gpu: Bug 200215060 workaround enabled.
```
This patch does a regex on the output of the architecture detection program to only capture architecture patterns.
It's more robust than before, but not fool-proof.
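The filtering idea can be sketched in Python. The patch itself is CMake; the pattern below (`digits.digits`) and the sample outputs are assumptions for illustration, with only the `nvrm_gpu` line taken from the message above.

```python
import re

# Assumed pattern: a compute capability looks like "<digits>.<digits>".
ARCH_PATTERN = re.compile(r"[0-9]+\.[0-9]+")


def extract_archs(detection_output: str) -> list:
    """Keep only the substrings that look like compute capabilities."""
    return ARCH_PATTERN.findall(detection_output)


# Driver chatter followed by the real answer from the detection program.
noisy = "nvrm_gpu: Bug 200215060 workaround enabled.\n6.2\n"
print(extract_archs(noisy))  # ['6.2']

# Not fool-proof: any other "N.M" token in the chatter is captured too.
tricky = "driver 1.0 loaded\n6.2\n"
print(extract_archs(tricky))  # ['1.0', '6.2']
```

The second call shows why such a filter is more robust than taking the raw output but can still be fooled by version-like tokens in the noise.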
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11851
Differential Revision: D9968362
Pulled By: soumith
fbshipit-source-id: b7952a87132ab05c724b287b76de263f1f671a0e
* Remove ATen's copy of FindCUDA
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Minor bugfix for updated FindCUDA.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Use cl.exe as the host compiler even when clcache.exe is set.
Upstream merge request at https://gitlab.kitware.com/cmake/cmake/merge_requests/1933
H/t peterjc123 who contributed the original version of this patch.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Include CMakeInitializeConfigs polyfill from ATen.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Tweak the regex so it actually works on Windows.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Update FindCUDA to cmake master as of 561238bb6f07a5ab31293928bd98f6f8911d8bc1
NB: I DID have to apply one local patch; it's the `include_guard` change. Should
be obvious next time you do an update.
Relevant commits:
commit 23119366e9d4e56e13c1fdec9dbff5e8f8c55ee5
Author: Edward Z. Yang <ezyang@fb.com>
Date: Wed Mar 28 11:33:56 2018 -0400
FindCUDA: Make nvcc configurable via CUDA_NVCC_EXECUTABLE env var
This is useful if, for example, you want ccache to be used
for nvcc. With the current behavior, cmake always picks up
/usr/local/cuda/bin/nvcc, even if there is a ccache nvcc
stub in the PATH. Allowing for CUDA_NVCC_EXECUTABLE lets
us work around the problem.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
commit e743fc8e9137692232f0220ac901f5a15cbd62cf
Author: Henry Fredrick Schreiner <henry.fredrick.schreiner@cern.ch>
Date: Thu Mar 15 15:30:50 2018 +0100
FindCUDA/select_compute_arch: Add support for CUDA as a language
Even though this is an internal module, we can still prepare it to
be used in another public-facing module outside of `FindCUDA`.
Issue: #16586
commit 193082a3c803a6418f0f1b5976dc34a91cf30805
Author: luz.paz <luzpaz@users.noreply.github.com>
Date: Thu Feb 8 06:27:21 2018 -0500
MAINT: Misc. typos
Found via `codespell -q 3 -I ../cmake-whitelist.txt`.
commit 9f74aaeb7d6649241c4a478410e87d092c462960
Author: Brad King <brad.king@kitware.com>
Date: Tue Jan 30 08:18:11 2018 -0500
FindCUDA: Fix regression in per-config flags
Changes in commit 48f7e2d300 (Unhardcode the CMAKE_CONFIGURATION_TYPES
values, 2017-11-27) accidentally left `CUDA_configuration_types`
undefined, but this is used in a few places to handle per-config flags.
Restore it.
Fixes: #17671
commit d91b2d9158cbe5d65bfcc8f7512503d7f226ad91
Author: luz.paz <luzpaz@users.noreply.github.com>
Date: Wed Jan 10 12:34:14 2018 -0500
MAINT: Misc. typos
Found via `codespell`
commit d08f3f551fa94b13a1d43338eaed68bcecb95cff
Merge: 1be22978e 1f4d7a071
Author: Brad King <brad.king@kitware.com>
Date: Wed Jan 10 15:34:57 2018 +0000
Merge topic 'unhardcode-configuration-types'
1f4d7a07 Help: Add references and backticks in LINK_FLAGS prop_tgt
48f7e2d3 Unhardcode the CMAKE_CONFIGURATION_TYPES values
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1345
commit 5fbfa18fadf945963687cd95627c1bc62b68948a
Merge: bc88329e5 ff41a4b81
Author: Brad King <brad.king@kitware.com>
Date: Tue Jan 9 14:26:35 2018 +0000
Merge topic 'FindCUDA-deduplicate-c+std-host-flags'
ff41a4b8 FindCUDA: de-duplicates C++11 flag when propagating host flags.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1628
commit bc88329e5ba7b1a14538f23f4fa223ac8d6d5895
Merge: 89d127463 fab1b432e
Author: Brad King <brad.king@kitware.com>
Date: Tue Jan 9 14:26:16 2018 +0000
Merge topic 'msvc2017-findcuda'
fab1b432 FindCUDA: Update to properly find MSVC 2017 compiler tools
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1631
commit 48f7e2d30000dc57c31d3e3ab81077950704a587
Author: Beren Minor <beren.minor+git@gmail.com>
Date: Mon Nov 27 19:22:11 2017 +0100
Unhardcode the CMAKE_CONFIGURATION_TYPES values
This removes duplicated code for per-config variable initialization by
providing a `cmake_initialize_per_config_variable(<PREFIX> <DOCSTRING>)`
function.
This function initializes a `<PREFIX>` cache variable from `<PREFIX>_INIT`
and unless the `CMAKE_NOT_USING_CONFIG_FLAGS` variable is defined, does
the same with `<PREFIX>_<CONFIG>` from `<PREFIX>_<CONFIG>_INIT` for every
`<CONFIG>` in `CMAKE_CONFIGURATION_TYPES` for multi-config generators or
`CMAKE_BUILD_TYPE` for single-config generators.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Polyfill CMakeInitializeConfigs
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Tweak condition for when to use bundled FindCUDA support.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Comment out include_guard.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
CMake 3.2 is required to properly track dependencies in projects imported as ExternalProject_Add (BUILD_BYPRODUCTS parameter).
Users on Ubuntu 14.04 LTS would need to install and use cmake3 package for configurations. Users of other popular distributions generally have a recent enough CMake package.
Summary:
This is in principle similar to #1612 and is tested on Windows 2017. CMake passes, although there are still bugs in the MSVC compiler that prevent CUDA from compiling properly.
The difference between this and #1612 is that this diff explicitly puts the CMake files into a separate folder and uses a MiscCheck.cmake chunk of code to test whether we need to include them. See README.txt for more details.
Closes https://github.com/caffe2/caffe2/pull/1727
Reviewed By: pietern
Differential Revision: D6693656
Pulled By: Yangqing
fbshipit-source-id: a74b0a1fde436d7bb2002a56affbc7bbb41ec621