Summary:
Currently there is no way to build MKLDNN with optimizations beyond sse4. This commit lets the MKLDNN build respect USE_NATIVE_ARCH.
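A plausible sketch of the intended logic, assuming MKL-DNN's `ARCH_OPT_FLAGS` cache variable (whose special value "HostOpts" enables host-specific optimization) and a hypothetical `MKLDNN_ROOT` path variable:
```cmake
# Sketch: honor USE_NATIVE_ARCH, otherwise keep building MKL-DNN for a generic arch.
if(USE_NATIVE_ARCH)
  set(ARCH_OPT_FLAGS "HostOpts" CACHE STRING "" FORCE)  # optimize for the build host
else()
  set(ARCH_OPT_FLAGS "" CACHE STRING "" FORCE)          # portable binaries
endif()
add_subdirectory(${MKLDNN_ROOT})  # hypothetical path variable
```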
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23445
Differential Revision: D16542275
Pulled By: ezyang
fbshipit-source-id: 550976531d6a52db9128c0e3d4589a33715feee2
Summary:
An illegal instruction is encountered in the pre-built MKL-DNN package. https://github.com/pytorch/pytorch/issues/23231
To avoid such binary-compatibility issues, the HostOpts option in MKL-DNN is disabled so that MKL-DNN is built for a generic arch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23292
Differential Revision: D16488773
Pulled By: soumith
fbshipit-source-id: 9e13c76fb9cb9338103cb767d7463c10891d294a
Summary:
---
How does the current code subsume all detections in the deleted `nccl.py`?
- The dependency of `USE_NCCL` on the OS and `USE_CUDA` is handled as dependency options in `CMakeLists.txt`.
- The main NCCL detection happens in [FindNCCL.cmake](8377d4b32c/cmake/Modules/FindNCCL.cmake), which is called by [nccl.cmake](8377d4b32c/cmake/External/nccl.cmake). When `USE_SYSTEM_NCCL` is false, the previous Python code deferred the detection to `find_package(NCCL)`. The change in `nccl.cmake` retains this.
- `USE_STATIC_NCCL` in the previous Python code simply changes the name of the detected library. This is done in `IF (USE_STATIC_NCCL)`.
- Now we only need to look at how the lines below line 20 in `nccl.cmake` are subsumed. These lines list paths to header and library directories that NCCL headers and libraries may reside in and try to search these directories for the key header and library files in turn. These are done by `find_path` for headers and `find_library` for the library files in `FindNCCL.cmake`.
* The call of [find_path](https://cmake.org/cmake/help/v3.8/command/find_path.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for headers in `<prefix>/include` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. Like the Python code, this commit sets `CMAKE_PREFIX_PATH` to search for `<prefix>` in `NCCL_ROOT_DIR` and the CUDA home. `CMAKE_SYSTEM_PREFIX_PATH` includes the standard directories such as `/usr/local` and `/usr`. `NCCL_INCLUDE_DIR` is also specifically handled.
* Similarly, the call of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for libraries in directories including `<prefix>/lib` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. But it also handles the edge cases the Python code intended to solve more properly (see the sketch after this list):
- It only searches `<prefix>/lib64` (and `<prefix>/lib32`) if that is appropriate for the system.
- It only searches `<prefix>/lib/<arch>` for the right `<arch>`, unlike the Python code, which searches `lib/<arch>` generically (e.g., the Python code searches for `/usr/lib/x86_64-linux-gnu`, but in reality systems can have `/usr/lib/x86_64-some-customized-name-linux-gnu`; see https://unix.stackexchange.com/a/226180/38242).
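For illustration, a condensed sketch of the detection described above (variable names follow the text; `NCCL_LIBNAME` is an assumption, and the actual FindNCCL.cmake is more elaborate):
```cmake
# Condensed sketch of the detection logic described above.
list(APPEND CMAKE_PREFIX_PATH ${NCCL_ROOT_DIR} ${CUDA_TOOLKIT_ROOT_DIR})

# <prefix>/include for each <prefix> in CMAKE_PREFIX_PATH and
# CMAKE_SYSTEM_PREFIX_PATH is searched by default.
find_path(NCCL_INCLUDE_DIRS
  NAMES nccl.h
  HINTS ${NCCL_INCLUDE_DIR})

if(USE_STATIC_NCCL)
  set(NCCL_LIBNAME "libnccl_static.a")  # only the searched name changes
else()
  set(NCCL_LIBNAME "nccl")              # unversioned first, per the find_library docs
endif()

# <prefix>/lib, lib64/lib32 (when appropriate) and lib/<arch> are covered.
find_library(NCCL_LIBRARIES
  NAMES ${NCCL_LIBNAME}
  HINTS ${NCCL_LIB_DIR})
```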
---
Regarding relevant issues:
- https://github.com/pytorch/pytorch/issues/12063 and https://github.com/pytorch/pytorch/issues/2877: These are properly handled, as explained in the updated comment.
- https://github.com/pytorch/pytorch/issues/2941 did not change NCCL detection specifically for Windows (it changed CUDA detection).
- b7e258f81e A versioned library detection is added, but the order is reversed: the unversioned library is now preferred. This is because unversioned libraries are normally linked to versioned libraries and preferred by users, and local installations by users are often unversioned. As the documentation of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) suggests:
> When using this to specify names with and without a version suffix, we recommend specifying the unversioned name first so that locally-built packages can be found before those provided by distributions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22930
Differential Revision: D16440275
Pulled By: ezyang
fbshipit-source-id: 11fe80743d4fe89b1ed6f96d5d996496e8ec01aa
Summary:
MKL-DNN is the main library for computation when we use the ideep device. It can use kernels implemented with different algorithms (including JIT, CBLAS, etc.) for computation. We add the "USE_MKLDNN_CBLAS" (default OFF) build option so that users can decide whether to use CBLAS computation methods or not.
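For example, the option could be enabled like this (the setup.py environment passthrough is an assumption, mirroring other `USE_*` flags):
```
# direct CMake invocation
cmake -DUSE_MKLDNN_CBLAS=ON ..
# or, assuming setup.py forwards it from the environment like other USE_* flags
USE_MKLDNN_CBLAS=1 python setup.py install
```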
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19014
Differential Revision: D16094090
Pulled By: ezyang
fbshipit-source-id: 3f0b1d1a59a327ea0d1456e2752f2edd78d96ccc
Summary:
`cmake/public/LoadHIP.cmake` calls `find_package(miopen)`, which uses the CMake module from the MIOpen installation (it includes the line `set(miopen_DIR ${MIOPEN_PATH}/lib/cmake/miopen)`). `cmake/Modules/FindMIOpen.cmake` is not used.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22244
Differential Revision: D16000771
Pulled By: bddppq
fbshipit-source-id: 07bb40fdf033521e8427fc351715d47e6e30ed34
Summary:
Dear All,
The proposed patch fixes the test code snippets used in the CMake infrastructure, and the implicit failure to properly set the ```CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS``` flag. Without the fix, libcaffe2.so will have some ```UND``` AVX2-related references, rendering it unusable.
* With GCC 9, the test code from the CMake build infra always fails:
```
$ gcc -O2 -g -pipe -Wall -m64 -mtune=generic -fopenmp -DCXX_HAS_AVX_1 -fPIE -o test.o -c test.c -mavx2
test.c: In function ‘main’:
test.c:11:26: error: incompatible type for argument 1 of ‘_mm256_extract_epi64’
11 | _mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
| ^
| |
| __m256 {aka __vector(8) float}
In file included from /usr/lib/gcc/x86_64-redhat-linux/9/include/immintrin.h:51,
from test.c:4:
/usr/lib/gcc/x86_64-redhat-linux/9/include/avxintrin.h:550:31: note: expected ‘__m256i’ {aka ‘__vector(4) long long int’} but argument is of type ‘__m256’ {aka ‘__vector(8) float’}
550 | _mm256_extract_epi64 (__m256i __X, const int __N)
|
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 9.0.1 20190328 (Red Hat 9.0.1-0.12) (GCC)
```
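A minimal sketch of the corrected probe: the vector must be the integer type `__m256i` to match the intrinsic's signature (the exact in-tree snippet may differ):
```cmake
include(CheckCSourceCompiles)
set(CMAKE_REQUIRED_FLAGS "-mavx2")
check_c_source_compiles("
  #include <immintrin.h>
  int main() {
    __m256i x = _mm256_setzero_si256();
    _mm256_extract_epi64(x, 0); /* we rely on this in our AVX2 code */
    return 0;
  }" CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS)
set(CMAKE_REQUIRED_FLAGS)  # reset the global so later checks are unaffected
```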
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18991
Differential Revision: D14821838
Pulled By: ezyang
fbshipit-source-id: 7eb3a854a1a831f6fda8ed7ad089746230b529d7
Summary:
Our AVX2 routines use functions such as _mm256_extract_epi64
that do not exist on 32 bit systems even when they have AVX2.
This disables AVX2 when _mm256_extract_epi64 does not exist.
This fixes the "local" part of #17901 (except disabling FBGEMM),
but there also is sleef to be updated and NNPACK to be fixed,
see the bug report for further discussion.
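In CMake terms, the guard amounts to something like the sketch below, reusing the probe from the previous entry (`CXX_AVX2_FOUND` is a hypothetical name for the AVX2-detection result):
```cmake
# Hypothetical gating sketch: when the probe fails (e.g. on 32-bit targets
# where _mm256_extract_epi64 does not exist), drop AVX2 entirely.
if(CXX_AVX2_FOUND AND NOT CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS)
  message(STATUS "Disabling AVX2: _mm256_extract_epi64 is unavailable")
  set(CXX_AVX2_FOUND OFF)
endif()
```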
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17915
Differential Revision: D14437338
Pulled By: soumith
fbshipit-source-id: d4ef7e0801b5d1222a855a38ec207dd88b4680da
Summary:
The requested changes support building PyTorch 1.0 on the Jetson Xavier with OpenBLAS. The Jetson Xavier with JetPack 3.3 has generic LAPACK installed. To pick up the CUDA-accelerated BLAS/LAPACK, I had to build OpenBLAS and build/link PyTorch from source. Otherwise, I got a runtime error indicating the LAPACK routines were not CUDA enabled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15660
Differential Revision: D13571324
Pulled By: soumith
fbshipit-source-id: 9b148d081d6e7fa7e1824dfdd93283c67f69e683
Summary:
Upgrade MKL-DNN to 0.17 and statically build MKL-DNN to fix potential build errors caused by an old mkldnn version on the host system.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15504
Differential Revision: D13547885
Pulled By: soumith
fbshipit-source-id: 46f790a3d9289c1e153e51c62be17c5206ea8f9a
Summary:
The motivation of this PR is to enforce that mkldnn uses the same OMP version as the caffe2 framework,
while not changing other assumptions within mkldnn.
Previously, MKL_cmake_included was set by caffe2 in order to disable OMP detection in mkldnn.
But with that change, mkldnn had no chance to adapt to the MKL found by caffe2,
so some MKL-related build flags were not set in mkldnn,
for example USE_MKL, USE_CBLAS, etc.
In this PR, we explicitly set MKLIOMP5LIB for mkldnn according to caffe2, and pass the MKL root path to mkldnn via MKLROOT. Then mkldnn is built as expected.
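Conceptually, the handoff looks like the sketch below (`MKL_OPENMP_LIBRARY`, `MKL_ROOT_DIR`, and `MKLDNN_ROOT` are assumed names for what caffe2 detected; the real glue lives in the MKLDNN CMake files):
```cmake
# Sketch: hand caffe2's choices to mkldnn instead of hiding MKL from it.
set(MKLIOMP5LIB ${MKL_OPENMP_LIBRARY} CACHE STRING "" FORCE)  # caffe2's detected iomp5
set(MKLROOT ${MKL_ROOT_DIR})                                  # caffe2's detected MKL root
add_subdirectory(${MKLDNN_ROOT})
```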
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13449
Differential Revision: D12899504
Pulled By: yinghai
fbshipit-source-id: 22a196bd00b4ef0a11d350a32c049304613edf52
Summary:
Performance-oriented code will use AVX/AVX2, so we don't need SSE-specific code anymore. This will also reduce the probability of running into an error on legacy CPUs.
On top of this, convolve is covered by modern libraries such as MKLDNN, which are much more performant and which we now build against by default (even for builds from source).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12109
Differential Revision: D10055134
Pulled By: colesbury
fbshipit-source-id: 789b8a34d5936d9c144bcde410c30f7eb1c776fa
Summary:
This fixes a cmake-time compilation error.
When we change the script to build Caffe2 with mkldnn, some cmake-time compilation support checks (like in libsleef) fail due to an incorrect setting of CMAKE_REQUIRED_LIBRARIES. It is a global setting which can interfere with cmake compilation checks if it is not cleaned up properly. FindBLAS.cmake and FindLAPACK.cmake didn't clean up this flag, which caused incorrect building of libsleef.so.
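The fix boils down to resetting the global after the check, roughly as in this sketch (the `sgemm_` check is illustrative):
```cmake
include(CheckFunctionExists)
set(CMAKE_REQUIRED_LIBRARIES ${BLAS_LIBRARIES})
check_function_exists(sgemm_ BLAS_HAS_SGEMM)
set(CMAKE_REQUIRED_LIBRARIES)  # clean up so later checks (e.g. sleef's) link correctly
```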
yinghai gujinghui
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12195
Differential Revision: D10159314
Pulled By: yinghai
fbshipit-source-id: 04908738f7d005579605b9c2a58d54f035d3baf4
Summary:
This PR establishes a baseline so that we can build IDEEP ops in the new workflow. From this baseline, we need to:
- Merge the CMake files of MKLDNN from caffe2 and PyTorch
- Get rid of `USE_MKL=ON`.
Build command from now on:
```
EXTRA_CAFFE2_CMAKE_FLAGS="-DUSE_MKL=ON -DINTEL_COMPILER_DIR=/opt/IntelComposerXE/2017.0.098" python setup.py build_deps
```
gujinghui
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12026
Differential Revision: D10041199
Pulled By: yinghai
fbshipit-source-id: b7310bd84a494ac899d8e25da368b63feed4eeaf
Summary:
This is a fix that's needed for building extensions with a
pre-packaged pytorch. Consider the scenario where
(1) pytorch is compiled and packaged on machine A
(2) the package is downloaded and installed on machine B
(3) an extension is compiled on machine B, using the downloaded package
Before this patch, stage (1) would embed absolute paths to the system
installation of mkl into the generated Caffe2Config.cmake, leading to
failures in stage (3) if mkl was not at the same location on B as on
A. After this patch, only a reference to the wrapper library is
embedded, which is re-resolved on machine B.
We are already using a similar approach for cuda.
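A hypothetical sketch of the wrapper-library idea (the target name `caffe2::mkl` is illustrative):
```cmake
# Machine A exports only the wrapper target; the concrete paths are
# re-resolved on machine B when its cmake run finds MKL again.
add_library(caffe2::mkl INTERFACE IMPORTED)
set_target_properties(caffe2::mkl PROPERTIES
  INTERFACE_INCLUDE_DIRECTORIES "${MKL_INCLUDE_DIR}"
  INTERFACE_LINK_LIBRARIES "${MKL_LIBRARIES}")
```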
Testing: built a package on jenkins, downloaded locally and compiled an extension. Works with this patch, fails without.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11298
Differential Revision: D9683150
Pulled By: anderspapitto
fbshipit-source-id: 06a80c3cd2966860ce04f76143b358de15f94aa4
Summary:
* first integration of MIOpen for batch norm and conv on ROCm
* workaround a ROCm compiler bug exposed by elementwise_kernel through explicit capture of variables in the densest packing
* workaround a ROCm compiler bug exposed by having `extern "C" __host__` as a definition and just `__host__` in the implementation through the hipify script
* use fabs() in accordance with C++11 for double absolute, not ::abs() which is integer-only on ROCm
* enable test_sparse set on CI, skip tests that don't work currently on ROCm
* enable more tests in test_optim after the elementwise_bug got fixed
* enable more tests in test_dataloader
* improvements to hipification and ROCm build
With this, resnet18 on CIFAR data trains without hang or crash in our tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10612
Reviewed By: bddppq
Differential Revision: D9423872
Pulled By: ezyang
fbshipit-source-id: 22c0c985217d65c593f35762b3eb16969ad96bdd
Summary:
I am using this to test a CI job to upload pip packages, and so am using the Caffe2 namespace to avoid affecting the existing pytorch packages.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9544
Reviewed By: orionr
Differential Revision: D9267111
Pulled By: pjh5
fbshipit-source-id: a68162ed29d2eb9ce353d8435ccb5f16c3b0b894
Summary:
This PR does 3 things
- Reorder the search order of `intel_lp64` and `gf_lp64`, as the first one is more essential and should have higher priority.
- Avoid repetitive searching of MKL libraries in `ideep` and `mkldnn` submodule if we already found those in `FindMKL`
- Avoid adding more MKL dependencies to IDEEP if MKL is also found.
TODO: provide an option for users to choose iomp or gomp.
Closes https://github.com/pytorch/pytorch/pull/8955
Reviewed By: bddppq
Differential Revision: D8666960
Pulled By: yinghai
fbshipit-source-id: 669d3142204a8b47c19a900444246fc44a139012
If CUDNN_INCLUDE_DIR, CUDNN_LIB_DIR, and/or CUDNN_ROOT_DIR were set,
but USE_CUDNN was not explicitly set, the code in
cmake/Dependencies.cmake would set USE_CUDNN=OFF even though it could
be found. This caused an issue in ATen, where it includes its CuDNN
bindings if the variable CUDNN_FOUND is set. This was the case,
because the find_package call in cmake/public/cuda.cmake searches for
CuDNN and ends up finding it. The net result is that ATen tried to
compile CuDNN bits, but the caffe2::cudnn target is never defined, let
alone added as a dependency, and the build fails on not being able to
find the header cudnn.h.
This change does two things:
1) Restore CuDNN autodetection by setting USE_CUDNN=ON if it is found (see the sketch below).
2) Remove obsolete FindCuDNN.cmake module. This functionality now
lives in cmake/public/cuda.cmake.
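A sketch of part (1); the real logic lives in cmake/Dependencies.cmake and relies on CUDNN_FOUND set by the find_package call in cmake/public/cuda.cmake:
```cmake
if(NOT DEFINED USE_CUDNN AND CUDNN_FOUND)
  set(USE_CUDNN ON)  # autodetected; an explicit -DUSE_CUDNN=OFF still wins
endif()
```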
* [c10d] Process Group NCCL implementation
* Addressed comments
* Added one missing return and clang format again
* Use cmake/Modules for everything and fix gloo build
* Fixed compiler warnings
* Deleted duplicated FindNCCL
* Have PyTorch depend on minimal libcaffe2.so instead of libATen.so
* Build ATen tests as a part of Caffe2 build
* Hopefully cufft and nvcc fPIC fixes
* Make ATen install components optional
* Add tests back for ATen and fix TH build
* Fixes for test_install.sh script
* Fixes for cpp_build/build_all.sh
* Fixes for aten/tools/run_tests.sh
* Switch ATen cmake calls to USE_CUDA instead of NO_CUDA
* Attempt at fix for aten/tools/run_tests.sh
* Fix typo in last commit
* Fix valgrind call after pushd
* Be forgiving about USE_CUDA disable like PyTorch
* More fixes on the install side
* Link all libcaffe2 during test run
* Make cuDNN optional for ATen right now
* Potential fix for non-CUDA builds
* Use NCCL_ROOT_DIR environment variable
* Pass -fPIC through nvcc to base compiler/linker
* Remove THCUNN.h requirement for libtorch gen
* Add Mac test for -Wmaybe-uninitialized
* Potential Windows and Mac fixes
* Move MSVC target props to shared function
* Disable cpp_build/libtorch tests on Mac
* Disable sleef for Windows builds
* Move protos under BUILD_CAFFE2
* Remove space from linker flags passed with -Wl
* Remove ATen from Caffe2 dep libs since directly included
* Potential Windows fixes
* Preserve options while sleef builds
* Force BUILD_SHARED_LIBS flag for Caffe2 builds
* Set DYLD_LIBRARY_PATH and LD_LIBRARY_PATH for Mac testing
* Pass TORCH_CUDA_ARCH_LIST directly in cuda.cmake
* Fixes for the last two changes
* Potential fix for Mac build failure
* Switch Caffe2 to build_caffe2 dir to not conflict
* Cleanup FindMKL.cmake
* Another attempt at Mac cpp_build fix
* Clear cpp-build directory for Mac builds
* Disable test in Mac build/test to match cmake
* [bootcamp] Improve "Shape" operator to support axes specification
Improve the .shape operator of Caffe2 to support x.shape(tensor, axes), which takes an optional int array "axes" as input. For example, x.shape(tensor, [1, 0]) will return the dimensions for axes 1 and 0 in the specified order. In the current version, the "axes" input allows duplicates and can have arbitrary length.
* Back out "Add barrier net that runs before training nets"
Original commit changeset: b373fdc9c30f. Need additional changes to some callers to support barrier failures.
* Change warning to verbose log to reduce log spam
The `LOG(WARNING)` was a bit spammy for regular use, so let's just make it a `VLOG`.
* Extract the shared code from different caffe2_benchmark binaries
The OSS benchmark and Internal benchmark will share most functions in the benchmark.
* Support MFR in sequence training
As titled.
* Make knowledge distillation work with using logged prediction feature as teacher label.
1) Add loading raw dense feature as teacher label.
2) Optional calibration function for teacher label
3) Add teacher label into generic unit test
4) Deprecated TTSN workflow version using feature_options to config teacher label
* [C2/CUDA]: unjoined cross entropy sigmoid
as desc
* Add async_scheduling executor into deferrable_net_exec_test
Add async_scheduling into tests and fix some exception cases
* Fix Event disabled error
When disabling an event in RNN ops, make sure we don't call Finish on a disabled
event from the op's RunAsync
* cuda ensure cpu output op can handle both TensorCPU and TensorCUDA
as desc.
* [C2 Core] Infer input device option in C2 hypothesis_test checkers
Improve how we default input blob device options.
Previously it defaulted to where the op lives, but that is not necessarily the case.
For example:
CopyCPUToGPU
* [C2 Op]SplitByLengthsOp CPU/GPU implementation
[C2 Op]SplitByLengthsOp CPU/GPU implementation
* fix undefined symbol error
not sure why we're getting undefined symbol even with link_whole = True
Need to figure out why but need this workaround for now
* Add tools to the DAIPlayground platform to help debug models
Add additional tools to allow Playground to override individual methods defined in AnyExp. This allows users to create modules that specifically change certain default method behaviors. An example included in this diff is deactivating the test model and checkpointing. When debugging model problems, switching off components helps me quickly narrow down the location of the bug. The technique is used extensively in task T27038712 (steady memory increase in EDPM, eventually resulting in gloo/cuda.cu:34: out of memory)
* add shape and type inference for int8 conversion operator
* Fix flaky test for group_norm
Fix flaky test for group_norm
* Fix group_norm_op_test flaky
Fix group_norm_op_test flaky
* Implementation of composite learning rate policy
In many state-of-the-art deep learning works, people use a simple trick to
schedule the learning rate: use a fixed learning rate until the error plateaus,
then switch to a different fixed learning rate, and so on. In this diff,
we implement a simple version of a composite learning rate. The user gives
a set of learning rate policies and corresponding iteration numbers, and the
optimizer changes the learning rate policy based on the number of iterations so far.
For example, suppose the user gives two learning rate policies, FixedLearningRate
and PolyLearningRate, with an iteration number of 1k. Then for the first 1k iterations,
we use FixedLearningRate, and for the following iterations, we use PolyLearningRate.
* Split two use cases of CachedReader into two classes, DBFileReader and CachedReader
# Use Cases:
1). input: DB file -> output: DatasetReader.
Use DBFileReader.
2). input: Reader -> build cache DB file -> output: DatasetReader.
Use CachedReader.
# Changes to CachedReader:
1). Move db_path to the constructor,
because with a mock reader the cache will always be built ahead of time.
# Changes to tests:
1). Make a separate TestCase class for CachedReader and DBFileReader.
2). Make it possible to add more test functions by adding setUp, tearDown and _make_temp_path.
3). Make deleting db_path more general: `db_path` could be a file for `log_file_db`, but could also be a directory for `leveldb`.
* Back out "On Mobile phones, call GlobalInit with no arguments in predictor in case we need to perform initialization"
Original commit changeset: 4489c6133f11
* Fix LARS bug
Fixed a bug in the LARS implementation which caused all subsequent blobs not using LARS to have the LARS learning rate multiplier applied to them.
* [tum] support sparse init & add uniformFill option
as title
* Propagate exception for async nets
Capture the exception when an exception is thrown in async nets and re-throw it after wait(). This allows exceptions to be propagated up to the caller.
This diff was a part of D7752068. We split the diff so that C2 core files changes are in a separate diff.
* Automatic update of fbcode/onnx to 69894f207dfcd72d1e70497d387201cec327efbc
Previous import was 403ccfbd0161c38f0834413d790bad0874afbf9a
Included changes:
- **[69894f2](https://github.com/onnx/onnx/commit/69894f2)**: Use op schema.all tensor types in random like definitions (#865) <Scott McKay>
- **[b9d6b90](https://github.com/onnx/onnx/commit/b9d6b90)**: Clarify random like operators (#846) <Scott McKay>
- **[fc6b5fb](https://github.com/onnx/onnx/commit/fc6b5fb)**: Refactor shape inference implementation (#855) <anderspapitto>
- **[b7d8dc8](https://github.com/onnx/onnx/commit/b7d8dc8)**: fix cmake warning message (#863) <Eric S. Yu>
- **[f585c5d](https://github.com/onnx/onnx/commit/f585c5d)**: add pytorch-operator test for tile (#831) <Wenhao Hu>
- **[993fe70](https://github.com/onnx/onnx/commit/993fe70)**: add install step (#832) <Eric S. Yu>
- **[68bc26c](https://github.com/onnx/onnx/commit/68bc26c)**: add type inference for traditional ml ops except classifier ops. (#857) <Ke Zhang>
- **[9cc0cda](https://github.com/onnx/onnx/commit/9cc0cda)**: fix string representation of scalar types (#858) <G. Ramalingam>
- **[1078925](https://github.com/onnx/onnx/commit/1078925)**: fix y in pow test case to scalar (#852) <Wenhao Hu>
- **[c66fb6f](https://github.com/onnx/onnx/commit/c66fb6f)**: Add some math function shape inference (#845) <anderspapitto>
- **[ff667d1](https://github.com/onnx/onnx/commit/ff667d1)**: Refactor return type and docs for ONNXIFI_BACKEND_DIRECTX_ID (#853) <Marat Dukhan>
- **[11c6876](https://github.com/onnx/onnx/commit/11c6876)**: clear initializer names when clear initializer (#849) <Wenhao Hu>
- **[73c34ae](https://github.com/onnx/onnx/commit/73c34ae)**: Clarify FeatureVectorizer description. (#843) <Scott McKay>
- **[1befb9b](https://github.com/onnx/onnx/commit/1befb9b)**: Remove useless text in docs (#850) <Lu Fang>
- **[e84788f](https://github.com/onnx/onnx/commit/e84788f)**: Fix SELU attributes' default values (#839) <Lu Fang>
- **[ebac046](https://github.com/onnx/onnx/commit/ebac046)**: Add tile test case (#823) <Wenhao Hu>
- **[8b7a925](https://github.com/onnx/onnx/commit/8b7a925)**: a few more shape inference functions (#772) <anderspapitto>
- **[9718f42](https://github.com/onnx/onnx/commit/9718f42)**: Make the coefficient non optional for LinearClassifier (#836) <Jaliya Ekanayake>
- **[ef083d0](https://github.com/onnx/onnx/commit/ef083d0)**: Add save_tensor and load_tensor functions for Protos (#770) <Lu Fang>
- **[45ceb55](https://github.com/onnx/onnx/commit/45ceb55)**: Check if CMAKE_BUILD_TYPE set before project(). (#812) <Sergii Dymchenko>
- **[4b3d2b0](https://github.com/onnx/onnx/commit/4b3d2b0)**: [WIP] reenable shape inference tests (#834) <anderspapitto>
- **[22d17ee](https://github.com/onnx/onnx/commit/22d17ee)**: RNN tests: LSTM, GRU, SimpleRNN (#739) <Peyman Manikashani>
- **[de65b95](https://github.com/onnx/onnx/commit/de65b95)**: dimension denotation (#443) <Tian Jin>
- **[eccc76e](https://github.com/onnx/onnx/commit/eccc76e)**: fix field number issue in onnx operator proto and enable its build (#829) <Ke Zhang>
- **[d582beb](https://github.com/onnx/onnx/commit/d582beb)**: disable shape inference test to unbreak ci (#830) <Lu Fang>
- **[485b787](https://github.com/onnx/onnx/commit/485b787)**: function proto for composite op. (#802) <Ke Zhang>
- **[cd58928](https://github.com/onnx/onnx/commit/cd58928)**: specify defaults for attributes of Affine op (#820) <G. Ramalingam>
- **[7ee2cf9](https://github.com/onnx/onnx/commit/7ee2cf9)**: merge the dummy backend back into the main one (#743) <anderspapitto>
- **[1c03a5a](https://github.com/onnx/onnx/commit/1c03a5a)**: [Proposal] ONNX Interface for Framework Integration (previously ONNX Backend API) header and docs (#551) <Marat Dukhan>
- **[3769a98](https://github.com/onnx/onnx/commit/3769a98)**: Rename real model test case from VGG-16 to ZFNet (#821) <Lu Fang>
* [C2]ReluN Op
relu n op.
tf reference: https://www.tensorflow.org/api_docs/python/tf/nn/relu6
* Call destructor when assigning a blob value
* Add executor overrides
Add executor overrides flag to enable migration to async_scheduling executor
* Add barrier net that runs before training nets - attempt #2
Add a synchronize barrier net that is run before the training nets. With this net, shards that are faster will wait for the other shards before starting training. This reduces the chances of the faster shards timing out during Gloo AllReduce.
Removed explicit data_parallel_model.py.synchronize call in holmes workflow.
This change was landed previously but caused errors for some EDPM workflows - see https://fb.facebook.com/groups/1426530000692545/permalink/1906766366002237/ - because EDPM assumes any call to CreateOrCloneCommonWorld and Gloo ops is wrapped in exception handlers, but in this case the exception thrown in the barrier init net was not handled.
To address this issue, we add _CreateOrCloneCommonWorld to the param_init_net instead of to a new barrier init net. Since errors during the param_init_net run are handled gracefully with re-rendezvous, this should fix the problem.
* Handle empty nets in async_scheduling
Make sure we don't get stuck on empty nets
* use CUDA_ARCH for conditional compile
* [C2 fix] infer function for ensure_cpu_output_op
* Update group_norm test to reduce flaky test
* Fix lr_multiplier for GPU
* Remove OpenGL code from benchmark
* Make it possible to print plots in an IPython notebook
* Create the blob if the blob is not specified in the init net
* Do not use the gf library for MKL. Even after installing the entire MKL library, it is still not found. After removing it, the MKL code still runs.
Summary:
Historically, exposing interface-dependent libraries (glog, gflags and protobuf) in Caffe2Config.cmake has been difficult.
New versions of glog and gflags ship with new-style cmake targets, so one does not need to use variables. New-style targets also make it easier for people to depend on them in installed config files.
This diff modernizes the gflags integration, and still provides a fallback path if the installed gflags does not come with cmake config files.
It does change one behavior of the build process though: when one specifies -DUSE_GFLAGS=ON but gflags cannot be found, the old script automatically turned it off, while the new script crashes, forcing the user to specify USE_GFLAGS=OFF.
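A sketch of the modernized lookup with the fallback path (details differ in-tree):
```cmake
find_package(gflags CONFIG QUIET)   # new-style: exports the gflags target
if(NOT TARGET gflags)
  find_package(Gflags MODULE QUIET) # fallback: old-style find module and variables
endif()
if(USE_GFLAGS AND NOT (TARGET gflags OR GFLAGS_FOUND))
  # new behavior: fail loudly instead of silently turning gflags off
  message(FATAL_ERROR "gflags requested but not found; pass -DUSE_GFLAGS=OFF to skip it.")
endif()
```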
Closes https://github.com/caffe2/caffe2/pull/1819
Differential Revision: D6826604
Pulled By: Yangqing
fbshipit-source-id: 210f3926f291c8bfeb24eb9671e5adfcbf8cf7fe
Summary:
Latest version of Gloo takes care of MPI_Init/MPI_Finalize for us, so
this commit removes handling that from caffe2/contrib/gloo. It also
imports CMake NCCL module changes from Gloo to stay consistent and
allow setting NCCL_INCLUDE_DIR and NCCL_LIB_DIR separately.
Closes https://github.com/caffe2/caffe2/pull/1295
Reviewed By: dzhulgakov
Differential Revision: D5979364
Pulled By: pietern
fbshipit-source-id: 794b00b0a445317c30a13cc8f0f4dc38e590cc77
Summary:
Here is the buggy behavior which this change fixes:
* On the first configure with CMake, a system-wide benchmark installation is not found, so we use the version in `third_party/` ([see here](https://github.com/caffe2/caffe2/blob/v0.8.1/cmake/Dependencies.cmake#L98-L100))
* On installation, the benchmark sub-project installs its headers to `CMAKE_INSTALL_PREFIX` ([see here](https://github.com/google/benchmark/blob/4bf28e611b/src/CMakeLists.txt#L41-L44))
* On a rebuild, CMake searches the system again for a benchmark installation (see https://github.com/caffe2/caffe2/issues/916 for details on why the first search is not cached)
* CMake includes `CMAKE_INSTALL_PREFIX` when searching the system ([docs](https://cmake.org/cmake/help/v3.0/variable/CMAKE_SYSTEM_PREFIX_PATH.html))
* Voila, a "system" installation of benchmark is found at `CMAKE_INSTALL_PREFIX`
* On a rebuild, `-isystem $CMAKE_INSTALL_PREFIX/include` is added to every build target ([see here](https://github.com/caffe2/caffe2/blob/v0.8.1/cmake/Dependencies.cmake#L97)). e.g:
```
cd /caffe2/build/caffe2/binaries && ccache /usr/bin/c++ -I/caffe2/build -isystem /caffe2/third_party/googletest/googletest/include -isystem /caffe2/install/include -isystem /usr/include/opencv -isystem /caffe2/third_party/eigen -isystem /usr/include/python2.7 -isystem /usr/lib/python2.7/dist-packages/numpy/core/include -isystem /caffe2/third_party/pybind11/include -isystem /usr/local/cuda/include -isystem /caffe2/third_party/cub -I/caffe2 -I/caffe2/build_host_protoc/include -fopenmp -std=c++11 -O2 -fPIC -Wno-narrowing -O3 -DNDEBUG -o CMakeFiles/split_db.dir/split_db.cc.o -c /caffe2/caffe2/binaries/split_db.cc
```
This causes two issues:
1. Since the headers and libraries at `CMAKE_INSTALL_PREFIX` have a later timestamp than the built files, an unnecessary rebuild is triggered
2. Outdated headers from the install directory are used during compilation, which can lead to strange build errors (which can usually be fixed by `rm -rf`'ing the install directory)
Possible solutions:
* Stop searching the system for an install of benchmark, and always use the version in `third_party/`
* Cache the initial result of the system-wide search for benchmark, so we don't accidentally pick up the installed version later
* Hack CMake to stop looking for headers and libraries in the installation directory
This PR is an implementation of the first solution. Feel free to close this and fix the issue in another way if you like.
Closes https://github.com/caffe2/caffe2/pull/1112
Differential Revision: D5761750
Pulled By: Yangqing
fbshipit-source-id: 2240088994ffafdb6eedb3626d898b505a4ba564
Summary:
Use HINTS instead of PATHS for find_library so that you can specify
-DNCCL_ROOT_DIR and it will use this NCCL installation regardless of
what else is installed on your system. Also add a path hint to include
the default base path for NCCL 2 libraries.
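The difference in a sketch (paths illustrative):
```cmake
# PATHS entries are searched after the system default paths, so a system-wide
# NCCL could shadow -DNCCL_ROOT_DIR. HINTS entries are searched first:
find_library(NCCL_LIBRARIES
  NAMES nccl
  HINTS ${NCCL_ROOT_DIR}/lib ${NCCL_ROOT_DIR}/lib64)
```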
Closes https://github.com/caffe2/caffe2/pull/1152
Reviewed By: Yangqing
Differential Revision: D5740053
Pulled By: pietern
fbshipit-source-id: 43f0908a63e8a9b90320dece0bbb558827433b48
Summary:
The PATHS suggestion to find_library is searched after everything
else. By using HINTS, it searches CUDNN_ROOT_DIR much earlier, avoiding
potential conflicts with other paths that have the CuDNN header.
Closes https://github.com/caffe2/caffe2/pull/1122
Reviewed By: Yangqing
Differential Revision: D5701822
Pulled By: pietern
fbshipit-source-id: 3f15757701aff167e7ae2a3e8a4ccf5d96763a0c
Summary:
I successfully built caffe2 using MSVC 2015 and the Ninja generator. I used vcpkg to build gflags, glog, lmdb and protobuf. Here is my build procedure:
1. Install vcpkg and set it up according to vcpkg docs
2. Install dependencies
```
$> vcpkg install gflags glog lmdb protobuf eigen3 --triplet x64-windows-static
```
3. Run CMake with this batch file
```Batch
setlocal
if NOT DEFINED VCPKG_DIR ( echo "Please define VCPKG_DIR" && exit /b 1 )
if NOT DEFINED CMAKE_BUILD_TYPE set CMAKE_BUILD_TYPE=Release
if NOT DEFINED BUILD_DIR set BUILD_DIR=build_%CMAKE_BUILD_TYPE%
if NOT DEFINED USE_CUDA set USE_CUDA=OFF
call "%VS140COMNTOOLS%\..\..\VC\vcvarsall.bat" amd64
if NOT EXIST %BUILD_DIR% (mkdir %BUILD_DIR%)
pushd %BUILD_DIR%
set CMAKE_GENERATOR=Ninja
set ZLIB_LIBRARY=%VCPKG_DIR%\installed\x64-windows-static\lib\zlib.lib
cmake -G"%CMAKE_GENERATOR%" ^
-DBUILD_SHARED_LIBS=OFF ^
-DCMAKE_VERBOSE_MAKEFILE=1 ^
-DBUILD_TEST=OFF ^
-DCMAKE_BUILD_TYPE=%CMAKE_BUILD_TYPE% ^
-DUSE_CUDA=%USE_CUDA% ^
-DZLIB_LIBRARY:FILEPATH="%ZLIB_LIBRARY%" ^
-DVCPKG_TARGET_TRIPLET=x64-windows-static ^
-DVCPKG_APPLOCAL_DEPS:BOOL=OFF ^
-DCMAKE_TOOLCHAIN_FILE:FILEPATH=%VCPKG_DIR%\scripts\buildsystems\vcpkg.cmake ^
-DPROTOBUF_PROTOC_EXECUTABLE:FILEPATH=%VCPKG_DIR%\installed\x64-windows-static\tools\protoc.exe ^
..\
ninja
popd
endlocal
```
Closes https://github.com/caffe2/caffe2/pull/880
Differential Revision: D5497384
Pulled By: Yangqing
fbshipit-source-id: e0d81d3dbd3286ab925eddef0e6fbf99eb6375a5
Summary:
libpthreadpool is needed during the linking stage and is missing when the user chooses to use an external nnpack installation (from system libraries).
Fixes GitHub issue #459.
Detailed discussion on [this comment](https://github.com/caffe2/caffe2/issues/459#issuecomment-308831547).
Closes https://github.com/caffe2/caffe2/pull/808
Differential Revision: D5430318
Pulled By: Yangqing
fbshipit-source-id: 5e10332fb01e54d8360bb929c1a82b0eef580bbb
Summary:
MKL on windows works with this change. Tested with MKL 2017 Update 3 (https://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-2017-release-notes).
Should fix #544.
With MKL 2017 Update 3, #514 should not happen either.
Note: I used Anaconda, which ships with its own MKL, so I had to make sure that the MKL 2017 Update 3 version was loaded by replacing the .dll in the `%AnacondaPrefix%\Library\bin` folder. Otherwise, numpy would load its own version and I would get all sorts of missing-procedure errors. Now that the same version is available through `conda`, this is easily fixed with `conda install mkl==2017.0.3`.
Closes https://github.com/caffe2/caffe2/pull/929
Differential Revision: D5429664
Pulled By: Yangqing
fbshipit-source-id: eaa150bab563ee4ce8348faee1624ac4af477513
Summary:
This PR changes the cmake of Caffe2 to look for system dependencies before resorting to the submodules in `third-party`. Only googletest should logically be in third-party, the other libraries should ideally be installed as system dependencies by the user. This PR adds system dependency checks for Gloo, CUB, pybind11, Eigen and benchmark, as these were missing from the cmake files.
In addition it removes the execution of `git submodule update --init` in cmake. This seems like bad behavior to me; it should be up to the user to download submodules and manage the git repository.
Closes https://github.com/caffe2/caffe2/pull/382
Differential Revision: D5124123
Pulled By: Yangqing
fbshipit-source-id: cc34dda58ffec447874a89d01058721c02a52476
Summary: Add a simple video data layer which allows reading video data from frames and videos, and outputs a 5D tensor. It also allows multiple labels. The current implementation is based on ffmpeg.
Differential Revision: D4801798
fbshipit-source-id: 46448e9c65fb055c2d71855447383a33ade0e444
Summary:
(Note: previous revert was due to a race condition between D4657831 and
D4659953 that I failed to catch.)
After this, we should have contbuild guarding the Windows build both with
and without CUDA.
This includes a series of changes that are needed to make Windows build,
specifically:
(1) Various flags that are needed in the cmake system, especially dealing
with /MD, /MT, cuda, cudnn, whole static linking, etc.
(2) Contbuild scripts based on AppVeyor.
(3) For Windows build, note that one will need to use "cmake --build" to
build stuff so that the build type is consistent between configuration and
actual build. see scripts\build_windows.bat for details.
(4) In logging.h, ERROR is already defined by Windows. I don't have a good
solution now, and as a result, LOG(ERROR) on Windows is going to be
LOG(INFO).
(5) Variable length arrays are not supported by MSVC (and they are not part of
the C++ standard). As a result I replaced them with vectors.
(6) sched.h is not available on Windows, so akyrola 's awesome simple
async net might encounter some slowdown due to no affinity setting on
Windows.
(7) MSVC has a bug that does not work very well with template calls inside
a templated function call, which is a known issue that should be fixed in
MSVC 2017. However, for now this means changes to conv_op_impl.h and
recurrent_net_op.h. No actual functionalities are changed.
(8) std host function calls are not supported in CUDA8+MSVC, so I changed
lp_pool (and maybe a few others) to use cuda device functions.
(9) The current Scale and Axpy have heavy templating that does not work
well with MSVC. As a result I reverted azzolini 's changes to the Scale
and Axpy interface, and moved the fixed-length versions to ScaleFixedSize and
AxpyFixedSize.
(10) CUDA + MSVC does not deal with Eigen well, so I guarded all Eigen
parts to only the non-CUDA part.
(11) In conclusion, it is fun but painful to deal with visual c++.
Differential Revision: D4666745
fbshipit-source-id: 3c9035083067bdb19a16d9c345c1ce66b6a86600
Summary:
After this, we should have contbuild guarding the Windows build both with
and without CUDA.
This includes a series of changes that are needed to make Windows build,
specifically:
(1) Various flags that are needed in the cmake system, especially dealing
with /MD, /MT, cuda, cudnn, whole static linking, etc.
(2) Contbuild scripts based on AppVeyor.
(3) For Windows build, note that one will need to use "cmake --build" to
build stuff so that the build type is consistent between configuration and
actual build. see scripts\build_windows.bat for details.
(4) In logging.h, ERROR is already defined by Windows. I don't have a good
solution now, and as a result, LOG(ERROR) on Windows is going to be
LOG(INFO).
(5) Variable length arrays are not supported by MSVC (and they are not part of
the C++ standard). As a result I replaced them with vectors.
(6) sched.h is not available on Windows, so akyrola 's awesome simple
async net might encounter some slowdown due to no affinity setting on
Windows.
(7) MSVC has a
Closes https://github.com/caffe2/caffe2/pull/183
Reviewed By: ajtulloch
Differential Revision: D4657831
Pulled By: Yangqing
fbshipit-source-id: 070ded372ed78a7e3e3919fdffa1d337640f146e
Summary: GitHub import didn't work and the manual import lost some files.
Reviewed By: Yangqing
Differential Revision: D4408509
fbshipit-source-id: ec8edb8c02876410f0ef212bde6847a7ba327fe4