Summary:
Dear All,
The proposed patch fixes the test code snippets used in cmake infrastructure, and implicit failure to set properly the ```CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS``` flag. The libcaffe2.so will have some ```UND``` avx2 related references, rendering it unusable.
* Using GCC 9 test code from cmake build infra always fails:
```
$ gcc -O2 -g -pipe -Wall -m64 -mtune=generic -fopenmp -DCXX_HAS_AVX_1 -fPIE -o test.o -c test.c -mavx2
test.c: In function ‘main’:
test.c:11:26: error: incompatible type for argument 1 of ‘_mm256_extract_epi64’
11 | _mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
| ^
| |
| __m256 {aka __vector(8) float}
In file included from /usr/lib/gcc/x86_64-redhat-linux/9/include/immintrin.h:51,
from test.c:4:
/usr/lib/gcc/x86_64-redhat-linux/9/include/avxintrin.h:550:31: note: expected ‘__m256i’ {aka ‘__vector(4) long long int’} but argument is of type ‘__m256’ {aka ‘__vector(8) float’}
550 | _mm256_extract_epi64 (__m256i __X, const int __N)
|
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 9.0.1 20190328 (Red Hat 9.0.1-0.12) (GCC)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18991
Differential Revision: D14821838
Pulled By: ezyang
fbshipit-source-id: 7eb3a854a1a831f6fda8ed7ad089746230b529d7
Summary:
Our AVX2 routines use functions such as _mm256_extract_epi64
that do not exist on 32 bit systems even when they have AVX2.
This disables AVX2 when _mm256_extract_epi64 does not exist.
This fixes the "local" part of #17901 (except disabling FBGEMM),
but there also is sleef to be updated and NNPACK to be fixed,
see the bug report for further discussion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17915
Differential Revision: D14437338
Pulled By: soumith
fbshipit-source-id: d4ef7e0801b5d1222a855a38ec207dd88b4680da
* Have PyTorch depend on minimal libcaffe2.so instead of libATen.so
* Build ATen tests as a part of Caffe2 build
* Hopefully cufft and nvcc fPIC fixes
* Make ATen install components optional
* Add tests back for ATen and fix TH build
* Fixes for test_install.sh script
* Fixes for cpp_build/build_all.sh
* Fixes for aten/tools/run_tests.sh
* Switch ATen cmake calls to USE_CUDA instead of NO_CUDA
* Attempt at fix for aten/tools/run_tests.sh
* Fix typo in last commit
* Fix valgrind call after pushd
* Be forgiving about USE_CUDA disable like PyTorch
* More fixes on the install side
* Link all libcaffe2 during test run
* Make cuDNN optional for ATen right now
* Potential fix for non-CUDA builds
* Use NCCL_ROOT_DIR environment variable
* Pass -fPIC through nvcc to base compiler/linker
* Remove THCUNN.h requirement for libtorch gen
* Add Mac test for -Wmaybe-uninitialized
* Potential Windows and Mac fixes
* Move MSVC target props to shared function
* Disable cpp_build/libtorch tests on Mac
* Disable sleef for Windows builds
* Move protos under BUILD_CAFFE2
* Remove space from linker flags passed with -Wl
* Remove ATen from Caffe2 dep libs since directly included
* Potential Windows fixes
* Preserve options while sleef builds
* Force BUILD_SHARED_LIBS flag for Caffe2 builds
* Set DYLD_LIBRARY_PATH and LD_LIBRARY_PATH for Mac testing
* Pass TORCH_CUDA_ARCH_LIST directly in cuda.cmake
* Fixes for the last two changes
* Potential fix for Mac build failure
* Switch Caffe2 to build_caffe2 dir to not conflict
* Cleanup FindMKL.cmake
* Another attempt at Mac cpp_build fix
* Clear cpp-build directory for Mac builds
* Disable test in Mac build/test to match cmake
* Update FindCUDA to cmake master as of 561238bb6f07a5ab31293928bd98f6f8911d8bc1
NB: I DID have to apply one local patch; it's the `include_guard` change. Should
be obvious next time you do an update.
Relevant commits:
commit 23119366e9d4e56e13c1fdec9dbff5e8f8c55ee5
Author: Edward Z. Yang <ezyang@fb.com>
Date: Wed Mar 28 11:33:56 2018 -0400
FindCUDA: Make nvcc configurable via CUDA_NVCC_EXECUTABLE env var
This is useful if, for example, you want ccache to be used
for nvcc. With the current behavior, cmake always picks up
/usr/local/cuda/bin/nvcc, even if there is a ccache nvcc
stub in the PATH. Allowing for CUDA_NVCC_EXECUTABLE lets
us work around the problem.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
commit e743fc8e9137692232f0220ac901f5a15cbd62cf
Author: Henry Fredrick Schreiner <henry.fredrick.schreiner@cern.ch>
Date: Thu Mar 15 15:30:50 2018 +0100
FindCUDA/select_compute_arch: Add support for CUDA as a language
Even though this is an internal module, we can still prepare it to
be used in another public-facing module outside of `FindCUDA`.
Issue: #16586
commit 193082a3c803a6418f0f1b5976dc34a91cf30805
Author: luz.paz <luzpaz@users.noreply.github.com>
Date: Thu Feb 8 06:27:21 2018 -0500
MAINT: Misc. typos
Found via `codespell -q 3 -I ../cmake-whitelist.txt`.
commit 9f74aaeb7d6649241c4a478410e87d092c462960
Author: Brad King <brad.king@kitware.com>
Date: Tue Jan 30 08:18:11 2018 -0500
FindCUDA: Fix regression in per-config flags
Changes in commit 48f7e2d300 (Unhardcode the CMAKE_CONFIGURATION_TYPES
values, 2017-11-27) accidentally left `CUDA_configuration_types`
undefined, but this is used in a few places to handle per-config flags.
Restore it.
Fixes: #17671
commit d91b2d9158cbe5d65bfcc8f7512503d7f226ad91
Author: luz.paz <luzpaz@users.noreply.github.com>
Date: Wed Jan 10 12:34:14 2018 -0500
MAINT: Misc. typos
Found via `codespell`
commit d08f3f551fa94b13a1d43338eaed68bcecb95cff
Merge: 1be22978e 1f4d7a071
Author: Brad King <brad.king@kitware.com>
Date: Wed Jan 10 15:34:57 2018 +0000
Merge topic 'unhardcode-configuration-types'
1f4d7a07 Help: Add references and backticks in LINK_FLAGS prop_tgt
48f7e2d3 Unhardcode the CMAKE_CONFIGURATION_TYPES values
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1345
commit 5fbfa18fadf945963687cd95627c1bc62b68948a
Merge: bc88329e5 ff41a4b81
Author: Brad King <brad.king@kitware.com>
Date: Tue Jan 9 14:26:35 2018 +0000
Merge topic 'FindCUDA-deduplicate-c+std-host-flags'
ff41a4b8 FindCUDA: de-duplicates C++11 flag when propagating host flags.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1628
commit bc88329e5ba7b1a14538f23f4fa223ac8d6d5895
Merge: 89d127463 fab1b432e
Author: Brad King <brad.king@kitware.com>
Date: Tue Jan 9 14:26:16 2018 +0000
Merge topic 'msvc2017-findcuda'
fab1b432 FindCUDA: Update to properly find MSVC 2017 compiler tools
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1631
commit 48f7e2d30000dc57c31d3e3ab81077950704a587
Author: Beren Minor <beren.minor+git@gmail.com>
Date: Mon Nov 27 19:22:11 2017 +0100
Unhardcode the CMAKE_CONFIGURATION_TYPES values
This removes duplicated code for per-config variable initialization by
providing a `cmake_initialize_per_config_variable(<PREFIX> <DOCSTRING>)`
function.
This function initializes a `<PREFIX>` cache variable from `<PREFIX>_INIT`
and unless the `CMAKE_NOT_USING_CONFIG_FLAGS` variable is defined, does
the same with `<PREFIX>_<CONFIG>` from `<PREFIX>_<CONFIG>_INIT` for every
`<CONFIG>` in `CMAKE_CONFIGURATION_TYPES` for multi-config generators or
`CMAKE_BUILD_TYPE` for single-config generators.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Polyfill CMakeInitializeConfigs
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Tweak condition for when to use bundled FindCUDA support.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Comment out include_guard.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Continuation of https://github.com/caffe2/caffe2/pull/2306 and based on Yangqing's PR at https://github.com/caffe2/caffe2/pull/2326
* Put caffe2_protos as static library and link it whole to libcaffe2.so
* For protobuf::libprotobuf, only link it to libcaffe2_protos (and hence libcaffe2.so), but not any downstream library. This avoids manipulating protobuf objects across dll boundaries.
* After the above, during linking one will receive complaint that fixed_address_empty_string is not found. This is because we compiled protobuf with hidden visibility, and the fact that the generated caffe2.pb.h has an inline function that invokes the inline function in protobuf GetEmptyStringAlreadyInited()
* Added sed-like commands to replace the generated header to use caffe2::GetEmptyStringAlreadyInited() instead. And, in proto_utils.cc, implement a function that essentially routes the function call to protobuf's internal one. The reason this works is that, caffe2::G... is visible globally, and libcaffe2.so is able to see the real protobuf one. This ensures that we are always calling protobuf functions that are inside libcaffe2.so.
* cmake target - work in progress
* wip cmake public targets
* Add missing INTERFACE keyword
* Add cuda public dependencies
* Add dependency for test targets
Summary:
Last fix was uncommitted due to a bug in internal build (CAFFE2_API causing error). This one re-applies it as well as a few more, especially enabling gtest.
Earlier commit message: Basically, this should make windows {static_lib, shared_lib} * {static_runtime, shared_runtime} * {cpu, gpu} work other than gpu shared_lib, which willyd kindly pointed out a symbol limit problem. A few highlights:
(1) Updated newest protobuf.
(2) use protoc dllexport command to ensure proper symbol export for windows.
(3) various code updates to make sure that C2 symbols are properly shown
(4) cmake file changes to make build proper
(5) option to choose static runtime and shared runtime similar to protobuf
(6) revert to visual studio 2015 as current cuda and msvc 2017 do not play well together.
(7) enabled gtest and fixed testing bugs.
Earlier PR is #1793
Closes https://github.com/caffe2/caffe2/pull/1827
Differential Revision: D6832086
Pulled By: Yangqing
fbshipit-source-id: 85f86e9a992ee5c53c70b484b761c9d6aed721df
Summary:
This reverts commit d286264fccc72bf90a2fcd7da533ecca23ce557e
bypass-lint
An infra SEV is better than not reverting this diff.
If you copy this password, see you in SEV Review!
cause_a_sev_many_files
Differential Revision: D6817719
fbshipit-source-id: 8fe0ad7aba75caaa4c3cac5e0a804ab957a1b836
Summary:
Basically, this should make windows {static_lib, shared_lib} * {static_runtime, shared_runtime} * {cpu, gpu} work. A few highlights:
(1) Updated newest protobuf.
(2) use protoc dllexport command to ensure proper symbol export.
(3) various code updates to make sure that C2 symbols are properly shown
(4) cmake file changes to make build proper
(5) option to choose static runtime and shared runtime similar to protobuf
(6) revert to visual studio 2015 as current cuda and msvc 2017 do not play well together.
Closes https://github.com/caffe2/caffe2/pull/1793
Reviewed By: dzhulgakov
Differential Revision: D6817719
Pulled By: Yangqing
fbshipit-source-id: d286264fccc72bf90a2fcd7da533ecca23ce557e
Summary:
Adds 2 features:
(1) In cmake, allow the use of -march=native
(2) During initialization, check if Caffe2 is built with matching cpu
features of the current machine.
This helps us guarding performance claims in case the Caffe2 baseline is
built with limited computation capability.
Currently only added avx, avx2 and fma which are common.
Closes https://github.com/caffe2/caffe2/pull/1775
Reviewed By: ezyang
Differential Revision: D6772059
Pulled By: Yangqing
fbshipit-source-id: 884a3d7c7a71ed9631b7c6269ae95d842a09e1bd
Summary:
This is in principle similar to #1612 and is tested on Windows 2017. CMake passes, although there are still bugs in the MSVC compiler that prevents cuda to compile properly.
The difference between this and #1612 is that this diff explicitly puts the CMake files into a separate folder and uses a MiscCheck.cmake chunk of code to test whether we need to include them. See README.txt for more details.
Closes https://github.com/caffe2/caffe2/pull/1727
Reviewed By: pietern
Differential Revision: D6693656
Pulled By: Yangqing
fbshipit-source-id: a74b0a1fde436d7bb2002a56affbc7bbb41ec621
Summary:
Thanks to feldim2425 we know that GCC 5 in Ubuntu 17.04 and later
doesn't define the macro _GLIBCXX_USE_C99 and by extension the
std::to_string, std::stoi, and std::stod functions (and probably
more). Instead of avoiding using these functions, we simply recommend
people to use GCC 6 or higher on the newer Ubuntu versions where GCC 5
doesn't work.
As a side note, CUDA 8.0 is compatible with GCC up to version 5. This
implies that compiling Caffe2 with CUDA on Ubuntu >= 17.10 implies
using CUDA >= 9.0. If you need to compile with CUDA 8.0 and are on
Ubuntu, you are stuck on version 16.04 or lower.
I verified this fix by running cmake on Ubuntu 17.10 with
-DCMAKE_CXX_COMPILER=/usr/bin/g++5 and observing the fatal error.
This closes#1633.
Closes https://github.com/caffe2/caffe2/pull/1645
Differential Revision: D6620812
Pulled By: pietern
fbshipit-source-id: 29af88cad9bede4fd952084c404c85db05baa9c4
Summary:
This means warnings and errors fire sooner rather than later.
This requires a fix for an issue where CMAKE_REQUIRED_FLAGS propagates
to some unrelated check, which then fails, because the Android
compiler doesn't support -mavx2.
Closes https://github.com/caffe2/caffe2/pull/1646
Differential Revision: D6620129
Pulled By: pietern
fbshipit-source-id: 4d1185406ebee3a523d39811bca6783bee82c898
Summary:
This will help releasing models that are using Caffe2 but have their own operator implementations and extensions. More detailed docs to arrive later. Let's see what contbuild says.
Closes https://github.com/caffe2/caffe2/pull/1378
Differential Revision: D6155045
Pulled By: Yangqing
fbshipit-source-id: 657a4c8de2f8e095bad5ed5db5b3e476b2a877e1
Summary:
This would allow one to debug with asan. Known problems:
- only works with new -fsanitizer=address option.
- not tested on clang.
It's turned off in default so existing builds won't be affected.
Closes https://github.com/caffe2/caffe2/pull/1299
Differential Revision: D5987034
Pulled By: Yangqing
fbshipit-source-id: de29cd3b84edaed5db73e33f8f759c5c3271b5b7
Summary:
After this, windows should be all green.
Closes https://github.com/caffe2/caffe2/pull/1228
Reviewed By: bwasti
Differential Revision: D5888328
Pulled By: Yangqing
fbshipit-source-id: 98fd39a4424237f2910df69c8609455d7af3ca34
Summary:
Using file(WRITE) caused the file to be rewritten for every CMake
reconfigure, which was causing unnecessary full rebuilds of the project
even when no source files changed.
The new strategy has the added benefit of enforcing that the macros.h file
is always generated correctly. When the main project relies on this
header for macro definitions (instead of relying on add_definitions()),
we can be more confident that the project will build correctly when used
as a library (which is the whole point of the macros.h file).
Upsides:
* No more unnecessary rebuilds
* Higher confidence that the project will compile properly as a third-party library
Downsides:
* Developers need to add an entry to `macros.h.in` whenever they would have added a new definition with `add_definitions()`
Closes https://github.com/caffe2/caffe2/pull/1103
Differential Revision: D5680367
Pulled By: Yangqing
fbshipit-source-id: 4db29c28589efda1b6a3f5f88752e3984260a0f2
Summary:
This PR replaces PR #464. It requires C+11 support using the
new CMake variables (`CMAKE_CXX_STANDARD`, `CMAKE_CXX_STANDARD_REQUIRED`,
etc.) when CMake is version 3.1 or above. Otherwise, if CMake is older
(e.g. Ubuntu 14.04) it falls back to using the -std=c++11 flag and
issues a warning.
This PR is based on the comment from Yangqing:
https://github.com/caffe2/caffe2/pull/464#issuecomment-305376923
The corresponding line in cmake/MiscCheck.cmake is removed in order to
reduce redundancy. Another option would be to move the C++11 logic to MiscCheck.cmake.
Closes https://github.com/caffe2/caffe2/pull/1027
Differential Revision: D5590646
Pulled By: Yangqing
fbshipit-source-id: 11ac63fbeaab7a1da02115549e214f9c529f1873
Summary:
While I was trying to make a quick oss cmakefile, I found that some of the
ios source files are out of sync with the most code changes. This diff should
fix the issues.
I manually ran cmake on the oss side with scripts/build_ios.sh to make sure
things pass.
Reviewed By: ajtulloch
Differential Revision: D5582265
fbshipit-source-id: 2636d353d32fcd8fb7087385b9bbed8476e33e74
Summary:
aaronmarkham this solves your Windows build issue. Basically:
(1) VS 2017 does not have CUDA support yet, and we will be waiting on NVidia to do so.
(2) VS 2015 and 2017 need different cmake generator strings.
This PR shows how to determine those and also updates appveyor to do contbuild guard for the following 3 settings:
- VS2015 without cuda
- VS2017 without cuda
- VS2015 with cuda
Closes https://github.com/caffe2/caffe2/pull/210
Differential Revision: D4745007
Pulled By: Yangqing
fbshipit-source-id: 50952552843abd0eb6f4145d9f132daeee3a6794
Summary:
After this, we should have contbuild guarding the Windows build both with
and without CUDA.
This includes a series of changes that are needed to make Windows build,
specifically:
(1) Various flags that are needed in the cmake system, specially dealing
with /MD, /MT, cuda, cudnn, whole static linking, etc.
(2) Contbuild scripts based on appveyo.
(3) For Windows build, note that one will need to use "cmake --build" to
build stuff so that the build type is consistent between configuration and
actual build. see scripts\build_windows.bat for details.
(4) In logging.h, ERROR is already defined by Windows. I don't have a good
solution now, and as a result, LOG(ERROR) on windows is going to be
LOG(INFO).
(5) variable length array is not supported by MSVC (and it is not part of
C++ standard). As a result I replaced them with vectors.
(6) sched.h is not available on Windows, so akyrola 's awesome simple
async net might encounter some slowdown due to no affinity setting on
Windows.
(7) MSVC has a
Closes https://github.com/caffe2/caffe2/pull/183
Reviewed By: ajtulloch
Differential Revision: D4657831
Pulled By: Yangqing
fbshipit-source-id: 070ded372ed78a7e3e3919fdffa1d337640f146e
Summary:
MSVC 2015 has known bugs about template functions so these changes aim to fix them - no functional differences introduced.
Closes https://github.com/caffe2/caffe2/pull/179
Reviewed By: ajtulloch
Differential Revision: D4635241
Pulled By: Yangqing
fbshipit-source-id: a282a96e1e626e9440c1e3f3cb15b5b1fa710887
Summary:
This clears up a bunch of windows build errors, but there are still 12 errors mostly relating to
- template keywords
- initializer list
- pthreadpool
that are not readily available on windows. Also, cuda build is being disabled right now.
Current error can be found here: https://ci.appveyor.com/project/Yangqing/caffe2-w2ucm
Closes https://github.com/caffe2/caffe2/pull/151
Reviewed By: bwasti
Differential Revision: D4564591
Pulled By: Yangqing
fbshipit-source-id: adacad5fa2d6d52d586700947972e3674e3b6e60
Summary:
Turns out that building on raspbian is easy as a cake for caffe2 - cmake is awesome.
Closes https://github.com/caffe2/caffe2/pull/112
Differential Revision: D4480985
Pulled By: Yangqing
fbshipit-source-id: 5dbe5e1e71d8680dea7a5ec8a9ce7fbe6aa5270a