pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Nikita Shulga	62c1e33fc9	[BE] Remove fast_nvcc tool (#96665 ) As of CUDA-11.4+ this functionality can be mimicked by passing the [`--threads`](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#threads-number-t) option to the CUDA compiler. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96665 Approved by: https://github.com/atalman, https://github.com/PaliC	2023-03-14 03:17:31 +00:00
mantaionut	3beafc91d1	USE_FAST_NVCC Windows (#95206 ) USE_FAST_NVCC now works on Windows. Fixes #67100 Pull Request resolved: https://github.com/pytorch/pytorch/pull/95206 Approved by: https://github.com/ezyang	2023-03-06 15:04:24 +00:00
Eddie Yan	db8e91ef73	[CUDA] Split out compute capability 8.7 and 7.2 from others (#95803 ) Follow up of #95008 to avoid building Jetson compute capabilities unnecessarily, also adds missing 7.2. CC @ptrblck @malfet Pull Request resolved: https://github.com/pytorch/pytorch/pull/95803 Approved by: https://github.com/ezyang	2023-03-02 14:13:15 +00:00
eqy	cc39cd6938	[CUDA][CUBLAS] Explicitly link against `cuBLASLt` (#95094 ) An issue surfaced recently that revealed that we were never explicitly linking against `cuBLASLt`, this fixes it by linking explicitly rather than depending on linker magic. CC @ptrblck @ngimel Pull Request resolved: https://github.com/pytorch/pytorch/pull/95094 Approved by: https://github.com/malfet, https://github.com/ngimel, https://github.com/atalman	2023-02-24 21:44:32 +00:00
Eddie Yan	13ebffe088	[CUDA] `sm_87` / Jetson Orin support (#95008 ) Surfaced from #94438 CC @ptrblck @ngimel Pull Request resolved: https://github.com/pytorch/pytorch/pull/95008 Approved by: https://github.com/ezyang	2023-02-17 02:22:23 +00:00
cyy	5fa7120722	Simplify CMake CUDNN code (#91676 ) 1. Move CUDNN code to a separate module. 2. Merge CUDNN public and private targets into a single private target. There is no need to expose the CUDNN dependency. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91676 Approved by: https://github.com/malfet	2023-02-08 01:06:10 +00:00
Eddie Yan	bac33ea8b6	[CUDA] Drop CUDA 10 support (#89582 ) CC @ptrblck @ngimel @malfet Pull Request resolved: https://github.com/pytorch/pytorch/pull/89582 Approved by: https://github.com/malfet, https://github.com/ngimel	2023-01-05 05:11:53 +00:00
Eddie Yan	a7420d2ccb	Hopper (`sm90`) support (#87736 ) Essentially a followup of #87436 CC @xwang233 @ptrblck Pull Request resolved: https://github.com/pytorch/pytorch/pull/87736 Approved by: https://github.com/xwang233, https://github.com/malfet	2022-11-09 01:49:50 +00:00
Greg Hogan	71fe069d98	ada lovelace (arch 8.9) support (#87436 ) changes required to be able to compile https://github.com/pytorch/vision and https://github.com/nvidia/apex for `sm_89` architecture Pull Request resolved: https://github.com/pytorch/pytorch/pull/87436 Approved by: https://github.com/ngimel	2022-10-24 21:25:36 +00:00
Sam Estep	5bcbbf5373	Lint trailing newlines (#54737 ) Summary: Context: https://github.com/pytorch/pytorch/issues/53406 added a lint for trailing whitespace at the ends of lines. However, in order to pass FB-internal lints, that PR also had to normalize the trailing newlines in four of the files it touched. This PR adds an OSS lint to normalize trailing newlines. The changes to the following files (made in 54847d0adb9be71be4979cead3d9d4c02160e4cd) are the only manually-written parts of this PR: - `.github/workflows/lint.yml` - `mypy-strict.ini` - `tools/README.md` - `tools/test/test_trailing_newlines.py` - `tools/trailing_newlines.py` I would have liked to make this just a shell one-liner like the other three similar lints, but nothing I could find quite fit the bill. Specifically, all the answers I tried from the following Stack Overflow questions were far too slow (at least a minute and a half to run on this entire repository): - [How to detect file ends in newline?](https://stackoverflow.com/q/38746) - [How do I find files that do not end with a newline/linefeed?](https://stackoverflow.com/q/4631068) - [How to list all files in the Git index without newline at end of file](https://stackoverflow.com/q/27624800) - [Linux - check if there is an empty line at the end of a file [duplicate]](https://stackoverflow.com/q/34943632) - [git ensure newline at end of each file](https://stackoverflow.com/q/57770972) To avoid giving false positives during the few days after this PR is merged, we should probably only merge it after https://github.com/pytorch/pytorch/issues/54967. Pull Request resolved: https://github.com/pytorch/pytorch/pull/54737 Test Plan: Running the shell script from the "Ensure correct trailing newlines" step in the `quick-checks` job of `.github/workflows/lint.yml` should print no output and exit in a fraction of a second with a status of 0. 
That was not the case prior to this PR, as shown by this failing GHA workflow run on an earlier draft of this PR: - https://github.com/pytorch/pytorch/runs/2197446987?check_suite_focus=true In contrast, this run (after correcting the trailing newlines in this PR) succeeded: - https://github.com/pytorch/pytorch/pull/54737/checks?check_run_id=2197553241 To unit-test `tools/trailing_newlines.py` itself (this is run as part of our "Test tools" GitHub Actions workflow): ``` python tools/test/test_trailing_newlines.py ``` Reviewed By: malfet Differential Revision: D27409736 Pulled By: samestep fbshipit-source-id: 46f565227046b39f68349bbd5633105b2d2e9b19	2021-03-30 13:09:52 -07:00
Rong Rong (AI Infra)	ebd142e94b	initial commit to enable fast_nvcc (#49773 ) Summary: draft enable fast_nvcc. * cleaned up some non-standard usages * added fall-back to wrap_nvcc Pull Request resolved: https://github.com/pytorch/pytorch/pull/49773 Test Plan: Configuration to enable fast nvcc: - install and enable `ccache` but delete `.ccache/` folder before each build. - `TORCH_CUDA_ARCH_LIST=6.0;6.1;6.2;7.0;7.5` - Toggling `USE_FAST_NVCC=ON/OFF` cmake config and run `cmake --build` to verify the build time. Initial statistics for a full compilation: * `cmake --build . -- -j $(nproc)`: - fast NVCC: ``` real 48m55.706s user 1559m14.218s sys 318m41.138s ``` - normal NVCC: ``` real 43m38.723s user 1470m28.131s sys 90m46.879s ``` * `cmake --build . -- -j $(nproc/4)`: - fast NVCC: ``` real 53m44.173s user 1130m18.323s sys 71m32.385s ``` - normal NVCC: ``` real 81m53.768s user 858m45.402s sys 61m15.539s ``` * Conclusion: fast NVCC doesn't provide much gain when the build is already set to use full CPU utilization; in fact it is even worse because of the thread switching. Initial statistics for a partial recompile (edit .cu files): * `cmake --build . -- -j $(nproc)` - fast NVCC: ``` [2021-01-13 18:10:24] [ 86%] Building NVCC (Device) object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_BinaryMiscOpsKernels.cu.o [2021-01-13 18:11:08] [ 86%] Linking CXX shared library ../lib/libtorch_cuda.so ``` - normal NVCC: ``` [2021-01-13 17:35:40] [ 86%] Building NVCC (Device) object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_BinaryMiscOpsKernels.cu.o [2021-01-13 17:38:08] [ 86%] Linking CXX shared library ../lib/libtorch_cuda.so ``` * Conclusion: Effective compilation time for a single CU file modification is reduced from 2min30sec to only 40sec when compiling for multiple architectures.
This shows 4X gain in speed up using fast NVCC -- reaching the theoretical limit of 5X when compiling 5 gencode architecture at the same time. Follow up PRs: - should have better fallback mechanism to detect whether a build is supported by fast_nvcc or not instead of dryruning then fail with fallback. - performance measurement instrumentation to measure what's the total compile time vs the parallel tasks critical path time. - figure out why `-j $(nproc)` gives significant sys overhead (`sys 318m41.138s` vs `sys 90m46.879s`) over normal nvcc, guess this is context switching, but not exactly sure Reviewed By: malfet Differential Revision: D25692758 Pulled By: walterddr fbshipit-source-id: c244d07b9b71f146e972b6b3682ca792b38c4457	2021-01-19 14:50:54 -08:00
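The partial-recompile figures quoted in this commit can be rechecked from the two log excerpts. The sketch below is not part of the commit; it assumes that the gap between the "Building NVCC" line and the "Linking CXX" line approximates the compile time of that one `.cu` file, and the helper name `elapsed_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamp format used in the build log excerpts of this commit message.
FMT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two log timestamps (hypothetical helper)."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# Timestamps copied from the "Building NVCC" and "Linking CXX" log lines.
fast = elapsed_seconds("2021-01-13 18:10:24", "2021-01-13 18:11:08")
normal = elapsed_seconds("2021-01-13 17:35:40", "2021-01-13 17:38:08")
speedup = normal / fast

print(f"fast: {fast:.0f}s, normal: {normal:.0f}s, speedup: {speedup:.2f}x")
# prints "fast: 44s, normal: 148s, speedup: 3.36x"
```

Read this way, the logs give roughly 3.4X, which the message rounds to "4X" after rounding the times to 2min30sec and 40sec.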
Richard Barnes	30a8ba93b1	Remove a blacklist reference (#50477 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50477 See task for context Test Plan: Sandcastle+OSS tests Reviewed By: xush6528 Differential Revision: D25893906 fbshipit-source-id: c9b86d0292aa751597d75e8d1b53f99b99c924b9	2021-01-13 13:39:06 -08:00
Rong Rong	611080a118	[hot fix] cuda 11.0.x doesn't support sm86. (#47408 ) Summary: Bump condition check from >11.0 to >11.0.3. CMake 3.5 doesn't support VERSION_GREATER_EQUAL, see [here](https://github.com/Dav1dde/glad/issues/134), so we might need to bump this again if 11.0.4+ releases. Should fix https://github.com/pytorch/pytorch/issues/47352 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47408 Reviewed By: glaringlee Differential Revision: D24759949 Pulled By: walterddr fbshipit-source-id: de384c7b150babaf799cce53ed198e5e931899da	2020-11-06 10:34:25 -08:00
Xiang Gao	0a15646e15	CUDA RTX30 series support (#45489 ) Summary: I also opened a PR on cmake upstream: https://gitlab.kitware.com/cmake/cmake/-/merge_requests/5292 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45489 Reviewed By: zhangguanheng66 Differential Revision: D23997844 Pulled By: ezyang fbshipit-source-id: 4e7443dde9e70632ee429184f0d51cb9aa5a98b5	2020-09-29 18:19:23 -07:00
Nikita Shulga	0e64b02912	FindCUDA error handling (#44236 ) Summary: Check return code of `nvcc --version` and if it's not zero, print warning and mark CUDA as not found. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44236 Test Plan: Run `CUDA_NVCC_EXECUTABLE=/foo/bar cmake ../` Reviewed By: ezyang Differential Revision: D23552336 Pulled By: malfet fbshipit-source-id: cf9387140a8cdbc8dab12fcc4bfaf55ae8e6a502	2020-09-07 18:17:55 -07:00
Alexander Grund	a4b831a86a	Replace if(NOT ${var}) by if(NOT var) (#41924 ) Summary: As explained in https://github.com/pytorch/pytorch/issues/41922 using `if(NOT ${var})` is usually wrong and can lead to issues like https://github.com/pytorch/pytorch/issues/41922 where the condition is wrongly evaluated to FALSE instead of TRUE. Instead the unevaluated variable name should be used in all cases, see the CMake documentation for details. This fixes the `NOT ${var}` cases by using a simple regexp replacement. It seems `pybind11_PREFER_third_party` is the only variable really prone to causing an issue as all others are set. However, due to CMake evaluating unquoted strings in `if` conditions as variable names, I recommend never using unquoted `${var}` in an if condition. A similar regexp based replacement could be done on the whole codebase, but as that makes a lot of changes I didn't include it now. Also, `if(${var})` will likely lead to a parser error if `var` is unset instead of a wrong result. Fixes https://github.com/pytorch/pytorch/issues/41922 Pull Request resolved: https://github.com/pytorch/pytorch/pull/41924 Reviewed By: seemethere Differential Revision: D22700229 Pulled By: mrshenli fbshipit-source-id: e2b3466039e4312887543c2e988270547a91c439	2020-07-23 15:49:20 -07:00
Xiang Gao	8940a4e684	Pull upstream select_compute_arch from cmake for Ampere (#41133 ) Summary: This pulls the following merge requests from CMake upstream: - https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4979 - https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4991 The above two merge requests improve the Ampere build: - If `TORCH_CUDA_ARCH_LIST` is not set, it can now automatically pick up 8.0 as part of its default value - If `TORCH_CUDA_ARCH_LIST=Ampere`, it no longer fails with `Unknown CUDA Architecture Name Ampere in CUDA_SELECT_NVCC_ARCH_FLAGS` Code related to architectures < 3.5 is manually removed because PyTorch no longer supports them. cc: ngimel ptrblck Pull Request resolved: https://github.com/pytorch/pytorch/pull/41133 Reviewed By: malfet Differential Revision: D22540547 Pulled By: ezyang fbshipit-source-id: 6e040f4054ef04f18ebb7513497905886a375632	2020-07-15 12:53:32 -07:00
Edward Yang	1f82679311	Revert D21156042: [pytorch][PR] CMake/Ninja: fix dependencies for .cu files Test Plan: revert-hammer Differential Revision: D21156042 Original commit changeset: fda3aaa57207 fbshipit-source-id: 59b208d4dc7ab743876af3ed382477770526aa1a	2020-04-21 14:24:27 -07:00
Wojciech Baranowski	db84689c09	CMake/Ninja: fix dependencies for .cu files (#36938 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/26304 After this patch `build.ninja` entries for `.cu` files will contain a `depfile` variable pointing to a `.NVCC-depend` file containing dependencies (i.e., header files included directly or indirectly) of the `.cu` source file. Until now, those `.NVCC-depend` files were being transposed into `.cu.o.depend` files in CMake format. That did not work as intended because the `.cu.o` target file was declared to be dependent on the `.cu.o.depend` file itself, rather than its contents. In fact, Ninja lacks the functionality to process dependencies in the CMake format of those `.cu.o.depend` files. This was tested on Linux as described in https://github.com/pytorch/pytorch/issues/26304#issuecomment-614667170 I have also verified that the original problem does not reproduce with Makefiles (i.e., when `ninja` is not present in the system) and that PyTorch still builds successfully with Makefiles after this patch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36938 Differential Revision: D21156042 Pulled By: ezyang fbshipit-source-id: fda3aaa57207f4d6bf74d2f254fe45fb7fd90eec	2020-04-21 09:43:48 -07:00
nihui	b69c685c4a	try to find cudnn header in /usr/include/cuda (#31755 ) Summary: With fedora negativo17 repo, the cudnn headers are installed in /usr/include/cuda directory, along side with other cuda libraries. Pull Request resolved: https://github.com/pytorch/pytorch/pull/31755 Differential Revision: D19697262 Pulled By: ezyang fbshipit-source-id: be80d3467ffb90fd677d551f4403aea65a2ef5b3	2020-02-04 14:10:32 -08:00
Sebastian Messmer	5554e5b793	Docs: c++11 -> c++14 (#30530 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30530 Switch some mentions of "C++11" in the docs to "C++14" ghstack-source-id: 95812049 Test Plan: testinprod Differential Revision: D18733733 fbshipit-source-id: b9d0490eb3f72bad974d134bbe9eb563f6bc8775	2019-12-17 14:09:02 -08:00
Sebastian Messmer	bc2e6d10fa	Back out "Revert D17908478: Switch PyTorch/Caffe2 to C++14" Summary: Original commit changeset: 775d2e29be0b Test Plan: CI Reviewed By: mruberry Differential Revision: D18775520 fbshipit-source-id: a350b3f86b66d97241f208786ee67e9a51172eac	2019-12-03 14:33:43 -08:00
Sebastian Messmer	a2ed50c920	Revert D17908478: Switch PyTorch/Caffe2 to C++14 Test Plan: revert-hammer Differential Revision: D17908478 Original commit changeset: 6e340024591e fbshipit-source-id: 775d2e29be0bc3a0db64f164c8960c44d4877d5d	2019-11-27 14:57:05 -08:00
Sebastian Messmer	d0acc9c085	Switch PyTorch/Caffe2 to C++14 (#30406 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30406 ghstack-source-id: 94642238 Test Plan: waitforsandcastle Differential Revision: D17908478 fbshipit-source-id: 6e340024591ec2c69521668022999df4a33b4ddb	2019-11-27 10:47:31 -08:00
Hong Xu	21d11e0b64	FindCUDA: Use find_program instead of find_path to find nvcc (#29160 ) Summary: Otherwise nvcc is not found if it is in env PATH but a non-standard location. Import from my patch for CMake: https://gitlab.kitware.com/cmake/cmake/merge_requests/3990 Although we currently do nvcc search in a Python script, it will be removed soon in https://github.com/pytorch/pytorch/issues/28617. Pull Request resolved: https://github.com/pytorch/pytorch/pull/29160 Differential Revision: D18326693 Pulled By: ezyang fbshipit-source-id: dc7ff3f6026f0655386ff685bce7372e2b061a4b	2019-11-05 08:51:35 -08:00
Edward Yang	c56464d13e	Turn off warnings on Windows CI. (#24331 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24331 Currently our logs are something like 40M a pop. Turning off warnings and turning on verbose makefiles (to see the compile commands) reduces this to more like 8M. We could probably reduce log size more but verbose makefile is really useful and we'll keep it turned on for Windows. Some findings: 1. Setting `CMAKE_VERBOSE_MAKEFILE` inside CMakelists.txt itself as suggested in https://github.com/ninja-build/ninja/issues/900#issuecomment-417917630 does not work on Windows. Setting `-DCMAKE_VERBOSE_MAKEFILE=1` does work (and we respect this environment variable.) 2. The high (`/W3`) warning level that is on by default on MSVC is due to cmake inserting this in the default flags. On recent versions of cmake, CMP0092 can be used to disable this flag in the default set. The string replace trick sort of works, but the standard snippet you'll find on the internet won't disable the flag from nvcc. I inspected the CUDA cmake code and verified it does respect CMP0092. 3. `EHsc` is also in the default flags; this one cannot be suppressed via a policy. The string replace trick seems to work... 4. ... however, it seems nvcc implicitly inserts an `/EHs` after `-Xcompiler` specified flags, which means that if we add `/EHa` to our set of flags, you'll get a warning from nvcc. So we probably have to figure out how to exclude EHa from the nvcc flags set (EHs does seem to work fine.) 5. To suppress warnings in nvcc, you must pass BOTH `-w` and `-Xcompiler /w`. Individually these are not enough. The patch applies these things; it also fixes a bug where nvcc verbose command printing doesn't work with `-GNinja`. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D17131746 Pulled By: ezyang fbshipit-source-id: fb142f8677072a5430664b28155373088f074c4b	2019-08-30 07:11:07 -07:00
Hong Xu	92750acb88	Move the detection of cuDNN to FindCUDNN.cmake (#24938 ) Summary: Currently they sit together with other code in cuda.cmake. This commit is the first step toward cleaning up cuDNN detection in our build system. Another attempt to https://github.com/pytorch/pytorch/issues/24293, which breaks manywheels build because it does not handle `USE_STATIC_CUDNN` properly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/24938 Differential Revision: D17070920 Pulled By: ezyang fbshipit-source-id: a4d017a3505c102d9c435a73ae62332e4336c52e	2019-08-27 06:51:52 -07:00
Edward Yang	907f5020c3	Revert D16914345: [pytorch][PR] Move the detection of cuDNN to FindCUDNN.cmake Differential Revision: D16914345 Original commit changeset: fd261478c01d fbshipit-source-id: b933ad7ed49028ab9ac6976c3ae768132dc9bacb	2019-08-20 14:23:12 -07:00
Hong Xu	6ce6939be9	Move the detection of cuDNN to FindCUDNN.cmake (#24784 ) Summary: Currently they sit together with other code in cuda.cmake. This commit is the first step toward cleaning up cuDNN detection in our build system. Another attempt to https://github.com/pytorch/pytorch/issues/24293, which breaks manywheels build because it does not handle `USE_STATIC_CUDNN`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/24784 Differential Revision: D16914345 Pulled By: ezyang fbshipit-source-id: fd261478c01d879dc770c1f1a56b17cc1a587be2	2019-08-20 01:55:46 -07:00
peterjc123	d9b4149e99	Fix cmake backslash syntax error on Windows. (#24420 ) Summary: ``` [1/1424] Building NVCC (Device) object caffe2/CMakeFiles/torch.dir/operators/torch_generated_weighted_sample_op.cu.obj CMake Warning (dev) at torch_generated_weighted_sample_op.cu.obj.Release.cmake:82 (set): Syntax error in cmake code at C:/Users/Ganzorig/pytorch/build/caffe2/CMakeFiles/torch.dir/operators/torch_generated_weighted_sample_op.cu.obj.Release.cmake:82 when parsing string C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include;C:/Users/Ganzorig/pytorch/aten/src;C:/Users/Ganzorig/pytorch/build;C:/Users/Ganzorig/pytorch;C:/Users/Ganzorig/pytorch/cmake/../third_party/googletest/googlemock/include;C:/Users/Ganzorig/pytorch/cmake/../third_party/googletest/googletest/include;;C:/Users/Ganzorig/pytorch/third_party/protobuf/src;C:/Users/Ganzorig/pytorch/cmake/../third_party/benchmark/include;C:/Users/Ganzorig/pytorch/cmake/../third_party/eigen;C:/Users/Ganzorig/Anaconda3/envs/code/include;C:/Users/Ganzorig/Anaconda3/envs/code/lib/site-packages/numpy/core/include;C:/Users/Ganzorig/pytorch/cmake/../third_party/pybind11/include;C:/Users/Ganzorig/pytorch/cmake/../third_party/cub;C:/Users/Ganzorig/pytorch/build/caffe2/contrib/aten;C:/Users/Ganzorig/pytorch/third_party/onnx;C:/Users/Ganzorig/pytorch/build/third_party/onnx;C:/Users/Ganzorig/pytorch/third_party/foxi;C:/Users/Ganzorig/pytorch/build/third_party/foxi;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include;C:/Users/Ganzorig/pytorch/caffe2/../torch/csrc/api;C:/Users/Ganzorig/pytorch/caffe2/../torch/csrc/api/include;C:/Program Files/NVIDIA 
Corporation/NvToolsExt/include;C:/Users/Ganzorig/pytorch/caffe2/aten/src/TH;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/TH;C:/Users/Ganzorig/pytorch/caffe2/../torch/../aten/src;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src;C:/Users/Ganzorig/pytorch/build/aten/src;C:/Users/Ganzorig/pytorch/caffe2/../torch/../aten/src;C:/Users/Ganzorig/pytorch/build/caffe2/../aten/src;C:/Users/Ganzorig/pytorch/build/caffe2/../aten/src/ATen;C:/Users/Ganzorig/pytorch/build/aten/src;C:/Users/Ganzorig/pytorch/caffe2/../torch/csrc;C:/Users/Ganzorig/pytorch/caffe2/../torch/../third_party/miniz-2.0.8;C:/Users/Ganzorig/pytorch/caffe2/../torch/csrc/api;C:/Users/Ganzorig/pytorch/caffe2/../torch/csrc/api/include;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/TH;C:/Users/Ganzorig/pytorch/aten/src/TH;C:/Users/Ganzorig/pytorch/aten/src;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src;C:/Users/Ganzorig/pytorch/build/aten/src;C:/Users/Ganzorig/pytorch/aten/src;C:/Users/Ganzorig/pytorch/aten/../third_party/catch/single_include;C:/Users/Ganzorig/pytorch/aten/src/ATen/..;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/ATen;C:/Users/Ganzorig/pytorch/third_party/miniz-2.0.8;C:/Users/Ganzorig/pytorch/caffe2/core/nomnigraph/include;C:/Users/Ganzorig/pytorch/caffe2/;C:/Program Files/NVIDIA GPU Computing 
Toolkit/CUDA/v10.1/include;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/TH;C:/Users/Ganzorig/pytorch/aten/src/TH;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/THC;C:/Users/Ganzorig/pytorch/aten/src/THC;C:/Users/Ganzorig/pytorch/aten/src/THCUNN;C:/Users/Ganzorig/pytorch/aten/src/ATen/cuda;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/TH;C:/Users/Ganzorig/pytorch/aten/src/TH;C:/Users/Ganzorig/pytorch/aten/src;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src;C:/Users/Ganzorig/pytorch/build/aten/src;C:/Users/Ganzorig/pytorch/aten/src;C:/Users/Ganzorig/pytorch/aten/../third_party/catch/single_include;C:/Users/Ganzorig/pytorch/aten/src/ATen/..;C:/Users/Ganzorig/pytorch/build/caffe2/aten/src/ATen;C:/Users/Ganzorig/pytorch/third_party/protobuf/src;C:/Users/Ganzorig/pytorch/c10/../;C:/Users/Ganzorig/pytorch/build;C:/Users/Ganzorig/pytorch/third_party/cpuinfo/include;C:/Users/Ganzorig/pytorch/third_party/FP16/include;C:/Users/Ganzorig/pytorch/third_party/foxi;C:/Users/Ganzorig/pytorch/third_party/foxi;C:/Users/Ganzorig/pytorch/third_party/onnx;C:/Users/Ganzorig/pytorch/build/third_party/onnx;C:/Users/Ganzorig/pytorch/build/third_party/onnx;C:/Users/Ganzorig/pytorch/c10/cuda/../..;C:/Users/Ganzorig/pytorch/build;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1\include;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include Invalid escape sequence \i Policy CMP0010 is not set: Bad variable reference syntax is an error. Run "cmake --help-policy CMP0010" for policy details. Use the cmake_policy command to set the policy and suppress this warning. This warning is for project developers. Use -Wno-dev to suppress it. 
``` Compared to https://github.com/pytorch/pytorch/issues/24044 , this commit moves the fix up, and uses [bracket arguments](https://cmake.org/cmake/help/v3.12/manual/cmake-language.7.html#bracket-argument). PR also sent to upstream: https://gitlab.kitware.com/cmake/cmake/merge_requests/3679 Pull Request resolved: https://github.com/pytorch/pytorch/pull/24420 Differential Revision: D16914193 Pulled By: ezyang fbshipit-source-id: 9f897cf4f607502a16dbd1045f2aedcb49c38da7	2019-08-20 01:25:20 -07:00
Ralf Gommers	92c63d90e8	Remove support for old architectures in cpp_extension and CMake (#24442 ) Summary: This is a follow-up to gh-23408. No longer supported are any arches < 3.5 (numbers + 'Fermi' and 'Kepler+Tegra'). Pull Request resolved: https://github.com/pytorch/pytorch/pull/24442 Differential Revision: D16889283 Pulled By: ezyang fbshipit-source-id: 3c0c35d51b7ac7642d1be7ab4b0f260ac93b60c9	2019-08-19 06:23:33 -07:00
Edward Yang	c676db230d	Revert D16834297: Move the search of cuDNN files to FindCUDNN.cmake. Differential Revision: D16834297 Original commit changeset: ec2c0ba0c659 fbshipit-source-id: 028a727f4baaaf4439c7ca17c999bba7ea6d419f	2019-08-16 08:30:21 -07:00
Hong Xu	482607c16c	Move the search of cuDNN files to FindCUDNN.cmake. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24293 Test Plan: Imported from OSS Differential Revision: D16834297 Pulled By: ezyang fbshipit-source-id: ec2c0ba0c659d82fffd40d52ae723934377aa49c	2019-08-16 06:07:25 -07:00
Ralf Gommers	cd20773701	Set CUDA arch correctly when building with torch.utils.cpp_extension (#23408 ) Summary: The old behavior was to always use `sm_30`. The new behavior is: - For building via a setup.py, check if `'arch'` is in `extra_compile_args`. If so, don't change anything. - If `TORCH_CUDA_ARCH_LIST` is set, respect that (can be 1 or more arches) - Otherwise, query device capability and use that. To test this, for example on a machine with `torch` installed for py37: ``` $ git clone https://github.com/pytorch/extension-cpp.git $ cd extension-cpp/cuda $ python setup.py install $ cuobjdump --list-elf build/lib.linux-x86_64-3.7/lltm_cuda.cpython-37m-x86_64-linux-gnu.so ELF file 1: lltm.1.sm_61.cubin ``` Existing tests in `test_cpp_extension.py` for `load_inline` and for compiling via `setup.py` in test/cpp_extensions/ cover this. Closes gh-18657 EDIT: some more tests: ``` from torch.utils.cpp_extension import load lltm = load(name='lltm', sources=['lltm_cuda.cpp', 'lltm_cuda_kernel.cu']) ``` ``` # with TORCH_CUDA_ARCH_LIST undefined or an empty string $ cuobjdump --list-elf /tmp/torch_extensions/lltm/lltm.so ELF file 1: lltm.1.sm_61.cubin # with TORCH_CUDA_ARCH_LIST = "3.5 5.2 6.0 6.1 7.0+PTX" $ cuobjdump --list-elf build/lib.linux-x86_64-3.7/lltm_cuda.cpython-37m-x86_64-linux-gnu.so ELF file 1: lltm_cuda.cpython-37m-x86_64-linux-gnu.1.sm_35.cubin ELF file 2: lltm_cuda.cpython-37m-x86_64-linux-gnu.2.sm_52.cubin ELF file 3: lltm_cuda.cpython-37m-x86_64-linux-gnu.3.sm_60.cubin ELF file 4: lltm_cuda.cpython-37m-x86_64-linux-gnu.4.sm_61.cubin ELF file 5: lltm_cuda.cpython-37m-x86_64-linux-gnu.5.sm_70.cubin ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/23408 Differential Revision: D16784110 Pulled By: soumith fbshipit-source-id: 69ba09e235e4f906b959fd20322c69303240ee7e	2019-08-15 15:25:15 -07:00
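The `cuobjdump` listings in this commit show one `sm_XX` cubin per entry of `TORCH_CUDA_ARCH_LIST`. As a rough illustration only, the sketch below maps such a list to nvcc `-gencode` flags. It is not the code in `torch.utils.cpp_extension`; the function name `arch_list_to_gencode` and the exact handling of separators and `+PTX` are assumptions made for this example.

```python
import re
from typing import List


def arch_list_to_gencode(arch_list: str) -> List[str]:
    """Turn e.g. "6.1 7.0+PTX" into nvcc -gencode flags (illustrative only)."""
    flags = []
    # Assumption: entries are separated by whitespace or semicolons.
    for entry in re.split(r"[;\s]+", arch_list.strip()):
        if not entry:
            continue
        want_ptx = entry.endswith("+PTX")
        version = entry[: -len("+PTX")] if want_ptx else entry
        num = version.replace(".", "")  # "6.1" -> "61"
        # Native machine code (SASS) for this architecture.
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if want_ptx:
            # Also embed PTX so newer GPUs can JIT-compile the kernel.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags


for flag in arch_list_to_gencode("3.5 5.2 6.0 6.1 7.0+PTX"):
    print(flag)
```

For the five-entry list used in the test above, this yields five `code=sm_XX` flags, matching the five cubins in the listing, plus one `code=compute_70` flag for the PTX entry.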
Hong Xu	0b1fee0819	Remove escape_path in our build system. (#24044 ) Summary: Which was added in https://github.com/pytorch/pytorch/issues/16412. Also make some CUDNN_* CMake variables to be build options so as to avoid direct reading using `$ENV` from environment variables from CMake scripts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/24044 Differential Revision: D16783426 Pulled By: ezyang fbshipit-source-id: cb196b0013418d172d0d36558995a437bd4a3986	2019-08-13 20:38:19 -07:00
peterjc123	20a5aa9670	Sync FindCUDA/select_compute_arch.cmake from upstream (#19392 ) Summary: 1. Fixes auto detection for Turing cards. 2. Adds Turing Support Pull Request resolved: https://github.com/pytorch/pytorch/pull/19392 Differential Revision: D14996142 Pulled By: soumith fbshipit-source-id: 3cd45c58212cf3db96e5fa19b07d9f1b59a1666a	2019-04-18 07:03:19 -07:00
SsnL	13422fca32	Add torch.backends.openmp.is_available(); fix some cmake messages (#16425 ) Summary: 1. add `torch.backends.openmp.is_available()` 2. Improve various `cmake` outputs 3. Fix LDFLAGS not respected by `caffe2_pybind11_state_*` targets 4. Fix `MKL` warning message, and QUIET flag. 5. Fix various typos Pull Request resolved: https://github.com/pytorch/pytorch/pull/16425 Differential Revision: D13903395 Pulled By: soumith fbshipit-source-id: d15c5d46f53e1ff1c27fca2887b9d23d0bd85b4d	2019-01-31 16:15:46 -08:00
peter	a7afd133f5	Sync FindCUDA.cmake with upstream cmake repo (#11880 ) Summary: Upstream PR: https://gitlab.kitware.com/cmake/cmake/merge_requests/2391/diffs Pull Request resolved: https://github.com/pytorch/pytorch/pull/11880 Differential Revision: D9989119 Pulled By: soumith fbshipit-source-id: 66e87367127975a5f1619fe447f74e76f101b503	2018-09-21 06:58:17 -07:00
Soumith Chintala	0927386890	Workaround CUDA logging on some embedded platforms (#11851 ) Summary: Fixes #11518 Upstream PR submitted at https://gitlab.kitware.com/cmake/cmake/merge_requests/2400 On some embedded platforms, the NVIDIA driver is verbose logging unexpected output to stdout. One example is Drive PX2, where we see something like this whenever a CUDA program is run: ``` nvrm_gpu: Bug 200215060 workaround enabled. ``` This patch does a regex on the output of the architecture detection program to only capture architecture patterns. It's more robust than before, but not fool-proof. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11851 Differential Revision: D9968362 Pulled By: soumith fbshipit-source-id: b7952a87132ab05c724b287b76de263f1f671a0e	2018-09-20 09:26:00 -07:00
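The filtering idea in this commit can be illustrated in a few lines. The actual patch is written in CMake; the Python sketch below only mirrors the approach, and the pattern `\d+\.\d+` and the name `extract_archs` are assumptions for this example, not taken from the patch.

```python
import re
from typing import List

# Assumption: a compute capability looks like "<major>.<minor>", e.g. "6.2".
ARCH_PATTERN = re.compile(r"\b\d+\.\d+\b")


def extract_archs(detector_output: str) -> List[str]:
    """Keep only architecture-looking tokens from noisy detector output."""
    return ARCH_PATTERN.findall(detector_output)


# Driver chatter from the commit message, followed by the real answer.
noisy = "nvrm_gpu: Bug 200215060 workaround enabled.\n6.2\n"
print(extract_archs(noisy))  # prints ['6.2']
```

As the message says, this is more robust but not fool-proof: a log line that itself contains something like `1.2` would also be captured.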
peter	10c29c8970	Fix CUDA 8 build on Windows (#11729 ) Summary: Tested via https://github.com/pytorch/pytorch/pull/11374. Upstream PR: https://gitlab.kitware.com/cmake/cmake/merge_requests/2391 Pull Request resolved: https://github.com/pytorch/pytorch/pull/11729 Differential Revision: D9847807 Pulled By: orionr fbshipit-source-id: 69af3e6c5bba0abcbc8830495e867a0b1b399c22	2018-09-16 08:09:24 -07:00
Syed Tousif Ahmed	b7ecf035dc	Updates FindCUDA.cmake to 3.12.2 upstream version (#11406 ) Summary: This PR is just a copy-paste of the upstream FindCUDA.cmake. Since cublas_device is deprecated in CUDA >= 9.2, this change is necessary for the build. Related: https://gitlab.kitware.com/cmake/cmake/merge_requests/2298 Pull Request resolved: https://github.com/pytorch/pytorch/pull/11406 Differential Revision: D9735563 Pulled By: ezyang fbshipit-source-id: c74d86ced7cc485cb2233f9066ce23e921832c30	2018-09-08 23:10:32 -07:00
Tongzhou Wang	73b92472d2	[README.md] Use GitLab URL for CMake (#8799 ) * update to GitLab url * use GitLab url for upstream CMake	2018-06-22 16:51:35 -04:00
Tongzhou Wang	675b579bf9	cmake wrapper (#8797 )	2018-06-22 15:29:25 -04:00
Tongzhou Wang	a4bd4f6c6f	Fix -g not passed to nvcc when DEBUG=1 (#8407 ) * Fix -g not passed to nvcc when DEBUG=1 * blacklist -Werror * filter CMAKE_CXX_FLAGS too * restore to space-delimited string before ending macro	2018-06-14 12:36:50 -04:00
Will Feng	2fdc00e41c	Use sccache for Windows build (#7331 )	2018-05-07 14:42:59 -04:00
Edward Z. Yang	73ab15d388	Change ATen to use Caffe2/cmake upstream FindCUDA (#6240 ) * Remove ATen's copy of FindCUDA Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Minor bugfix for updated FindCUDA. Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Use cl.exe as the host compiler even when clcache.exe is set. Upstream merge request at https://gitlab.kitware.com/cmake/cmake/merge_requests/1933 H/t peterjc123 who contributed the original version of this patch. Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Include CMakeInitializeConfigs polyfill from ATen. Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Tweak the regex so it actually works on Windows. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-04-04 23:26:57 -04:00
Edward Z. Yang	ed9952dd25	Update FindCUDA to cmake master as of 561238bb6f07a5ab31293928bd98f6f… (#6241 ) * Update FindCUDA to cmake master as of 561238bb6f07a5ab31293928bd98f6f8911d8bc1 NB: I DID have to apply one local patch; it's the `include_guard` change. Should be obvious next time you do an update. Relevant commits: commit 23119366e9d4e56e13c1fdec9dbff5e8f8c55ee5 Author: Edward Z. Yang <ezyang@fb.com> Date: Wed Mar 28 11:33:56 2018 -0400 FindCUDA: Make nvcc configurable via CUDA_NVCC_EXECUTABLE env var This is useful if, for example, you want ccache to be used for nvcc. With the current behavior, cmake always picks up /usr/local/cuda/bin/nvcc, even if there is a ccache nvcc stub in the PATH. Allowing for CUDA_NVCC_EXECUTABLE lets us work around the problem. Signed-off-by: Edward Z. Yang <ezyang@fb.com> commit e743fc8e9137692232f0220ac901f5a15cbd62cf Author: Henry Fredrick Schreiner <henry.fredrick.schreiner@cern.ch> Date: Thu Mar 15 15:30:50 2018 +0100 FindCUDA/select_compute_arch: Add support for CUDA as a language Even though this is an internal module, we can still prepare it to be used in another public-facing module outside of `FindCUDA`. Issue: #16586 commit 193082a3c803a6418f0f1b5976dc34a91cf30805 Author: luz.paz <luzpaz@users.noreply.github.com> Date: Thu Feb 8 06:27:21 2018 -0500 MAINT: Misc. typos Found via `codespell -q 3 -I ../cmake-whitelist.txt`. commit 9f74aaeb7d6649241c4a478410e87d092c462960 Author: Brad King <brad.king@kitware.com> Date: Tue Jan 30 08:18:11 2018 -0500 FindCUDA: Fix regression in per-config flags Changes in commit 48f7e2d300 (Unhardcode the CMAKE_CONFIGURATION_TYPES values, 2017-11-27) accidentally left `CUDA_configuration_types` undefined, but this is used in a few places to handle per-config flags. Restore it. Fixes: #17671 commit d91b2d9158cbe5d65bfcc8f7512503d7f226ad91 Author: luz.paz <luzpaz@users.noreply.github.com> Date: Wed Jan 10 12:34:14 2018 -0500 MAINT: Misc. typos Found via `codespell` commit d08f3f551fa94b13a1d43338eaed68bcecb95cff Merge: 1be22978e 1f4d7a071 Author: Brad King <brad.king@kitware.com> Date: Wed Jan 10 15:34:57 2018 +0000 Merge topic 'unhardcode-configuration-types' `1f4d7a07` Help: Add references and backticks in LINK_FLAGS prop_tgt 48f7e2d3 Unhardcode the CMAKE_CONFIGURATION_TYPES values Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !1345 commit 5fbfa18fadf945963687cd95627c1bc62b68948a Merge: bc88329e5 ff41a4b81 Author: Brad King <brad.king@kitware.com> Date: Tue Jan 9 14:26:35 2018 +0000 Merge topic 'FindCUDA-deduplicate-c+std-host-flags' ff41a4b8 FindCUDA: de-duplicates C++11 flag when propagating host flags. Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !1628 commit bc88329e5ba7b1a14538f23f4fa223ac8d6d5895 Merge: 89d127463 fab1b432e Author: Brad King <brad.king@kitware.com> Date: Tue Jan 9 14:26:16 2018 +0000 Merge topic 'msvc2017-findcuda' fab1b432 FindCUDA: Update to properly find MSVC 2017 compiler tools Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Robert Maynard <robert.maynard@kitware.com> Merge-request: !1631 commit 48f7e2d30000dc57c31d3e3ab81077950704a587 Author: Beren Minor <beren.minor+git@gmail.com> Date: Mon Nov 27 19:22:11 2017 +0100 Unhardcode the CMAKE_CONFIGURATION_TYPES values This removes duplicated code for per-config variable initialization by providing a `cmake_initialize_per_config_variable(<PREFIX> <DOCSTRING>)` function. This function initializes a `<PREFIX>` cache variable from `<PREFIX>_INIT` and unless the `CMAKE_NOT_USING_CONFIG_FLAGS` variable is defined, does the same with `<PREFIX>_<CONFIG>` from `<PREFIX>_<CONFIG>_INIT` for every `<CONFIG>` in `CMAKE_CONFIGURATION_TYPES` for multi-config generators or `CMAKE_BUILD_TYPE` for single-config generators. Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Polyfill CMakeInitializeConfigs Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Tweak condition for when to use bundled FindCUDA support. Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Comment out include_guard. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-04-04 17:04:21 -04:00
Marat Dukhan	c9cc514df4	Bump minimum CMake version to 3.2 CMake 3.2 is required to properly track dependencies in projects imported as ExternalProject_Add (BUILD_BYPRODUCTS parameter). Users on Ubuntu 14.04 LTS would need to install and use cmake3 package for configurations. Users of other popular distributions generally have a recent enough CMake package.	2018-03-06 19:57:48 -08:00
Yangqing Jia	ab638020f8	Backport FindCUDA functionalities from CMake Summary: This is in principle similar to #1612 and is tested on Windows 2017. CMake passes, although there are still bugs in the MSVC compiler that prevents cuda to compile properly. The difference between this and #1612 is that this diff explicitly puts the CMake files into a separate folder and uses a MiscCheck.cmake chunk of code to test whether we need to include them. See README.txt for more details. Closes https://github.com/caffe2/caffe2/pull/1727 Reviewed By: pietern Differential Revision: D6693656 Pulled By: Yangqing fbshipit-source-id: a74b0a1fde436d7bb2002a56affbc7bbb41ec621	2018-01-10 16:36:03 -08:00

49 Commits