85 Commits

Author SHA1 Message Date
Siddhartha Menon
e1e6417d4c Add SVE implementation of embedding_lookup_idx (#133995)
Adds an accelerated version of the embedding_lookup_idx perfkernels. This is done via a python codegen file similarly to `caffe2/perfkernels/hp_emblookup_codegen.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133995
Approved by: https://github.com/malfet, https://github.com/huydhn
2024-10-15 18:52:44 +00:00
PyTorch MergeBot
dac0b4e62b Revert "Add SVE implementation of embedding_lookup_idx (#133995)"
This reverts commit 770c134998.

Reverted https://github.com/pytorch/pytorch/pull/133995 on behalf of https://github.com/clee2000 due to breaking internal tests, I'm wondering if this just needs a targets change for buck? ([comment](https://github.com/pytorch/pytorch/pull/133995#issuecomment-2414596554))
2024-10-15 17:23:50 +00:00
Siddhartha Menon
770c134998 Add SVE implementation of embedding_lookup_idx (#133995)
Adds an accelerated version of the embedding_lookup_idx perfkernels. This is done via a python codegen file similarly to `caffe2/perfkernels/hp_emblookup_codegen.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133995
Approved by: https://github.com/malfet, https://github.com/huydhn
2024-10-14 10:17:27 +00:00
Nikita Shulga
4fd16dd8aa Clarify that libtorch API is C++17 compatible (#136471)
As it relies on some common C++17 primitives, such as `std::optional`
Replace all docs references from C++14 to C++17

Fixes https://github.com/pytorch/pytorch/issues/133205

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136471
Approved by: https://github.com/kit1980, https://github.com/atalman
2024-09-24 02:03:33 +00:00
cyy
75f141be62 Avoid unnecessary CMake warnings on Windows (#136393)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136393
Approved by: https://github.com/ezyang
2024-09-23 06:42:59 +00:00
cyyever
c638a40a93 [Caffe2] Remove unused AVX512 code (#133160)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133160
Approved by: https://github.com/albanD
2024-08-23 23:16:16 +00:00
PyTorch MergeBot
fa1d7b0262 Revert "Remove unused Caffe2 macros (#132979)"
This reverts commit da65cfbdea.

Reverted https://github.com/pytorch/pytorch/pull/132979 on behalf of https://github.com/ezyang due to these are apparently load bearing internally ([comment](https://github.com/pytorch/pytorch/pull/132979#issuecomment-2284666332))
2024-08-12 18:34:56 +00:00
cyy
da65cfbdea Remove unused Caffe2 macros (#132979)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132979
Approved by: https://github.com/ezyang
2024-08-09 04:48:20 +00:00
cyy
3d617333e7 Simplify CMake code (#127683)
Due to the recent adoption of find(python), it is possible to further simplify some CMake code.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127683
Approved by: https://github.com/ezyang
2024-06-05 15:17:31 +00:00
Nikita Shulga
96e3b3ac72 [BE] Cleanup CMake flag suppressions (#97584)
Use `append_cxx_flag_if_supported` to determine whether or not `-Werror` is supported
Do not suppress deprecation warnings if glog is not used/installed; the way the check is written right now, it suppresses deprecations even if `glog` is not installed.
Similarly, do not suppress deprecations on MacOS simply because we are compiling with protobuf.
Fix deprecation warnings in:
 - MPS by replacing `MTLResourceOptionCPUCacheModeDefault`->`MTLResourceCPUCacheModeDefaultCache`
 - In GTests by replacing `TYPED_TEST_CASE`->`TYPED_TEST_SUITE`
 - In `codegen/onednn/interface.cpp`, by passing `Stack` by reference rather than by pointer.

Do not guard calls to `append_cxx_flag_if_supported` with `if(CLANG)` or `if(GCC)`.
Fix some deprecated calls in `Metal` and hide more complex exceptions under `C10_CLANG_DIAGNOSTIC_IGNORE`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97584
Approved by: https://github.com/kit1980
2023-03-27 18:46:09 +00:00
cyy
666efd8d5d Improve ASAN and TSAN handling in cmake (#93147)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93147
Approved by: https://github.com/malfet
2023-03-07 14:10:13 +00:00
cyy
9291f9b9e2 Simplify cmake code (#91546)
We use various newer CMake features to simplify the build system:
1. Caffe2::threads is replaced by threads::threads.
2. Some unused MSVC flags are removed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91546
Approved by: https://github.com/malfet, https://github.com/Skylion007
2023-02-08 01:05:19 +00:00
cyy
9710ac6531 Some CMake and CUDA cleanup given recent update to C++17 (#90599)
The main changes are:
1. Remove outdated checks for old compiler versions because they can't support C++17.
2. Remove outdated CMake checks because it now requires 3.18.
3. Remove outdated CUDA checks because we are moving to CUDA 11.

Almost all changes are in CMake files for easy auditing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90599
Approved by: https://github.com/soumith
2022-12-30 11:19:26 +00:00
peterjc123
9bb1371cc2 Disable RDYNAMIC check with MSVC (#62949)
Summary:
When testing with clang-cl, the flag is added though it is unsupported and that generates a few warnings. Tried a few alternatives like https://cmake.org/cmake/help/latest/module/CheckLinkerFlag.html, but they just don't work.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62949

Reviewed By: zhouzhuojie, driazati

Differential Revision: D30359206

Pulled By: malfet

fbshipit-source-id: 1bd27ad5772fe6757fa8c3a4bddf904f88d70b7b
2021-08-18 11:51:23 -07:00
Jane Xu
e318058ffe Ignore LNK4099 for debug binary libtorch builds (#62060)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/61979

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62060

Test Plan:
This CI shouldn't break
and https://github.com/pytorch/pytorch/pull/62061

Reviewed By: driazati

Differential Revision: D29877487

Pulled By: janeyx99

fbshipit-source-id: 497f84caab3f9ae609644fd397ad87a6dc8a2a77
2021-07-23 09:31:41 -07:00
Hong Xu
7acb8b71e1 Remove AVX detection code that duplicates FindAVX.cmake (#61748)
Summary:
This PR deletes some code in `MiscCheck.cmake` that performs exactly the same function as `FindAVX.cmake`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61748

Reviewed By: ejguan

Differential Revision: D29791282

Pulled By: malfet

fbshipit-source-id: 6595fd1b61c8ae12b821fad8c9a34892dd52d213
2021-07-20 14:34:36 -07:00
Hong Xu
f912889726 Remove unnecessary Ubuntu version checks (#61738)
Summary:
PR https://github.com/pytorch/pytorch/issues/5401 missed another Ubuntu version check in `cmake/MiscCheck.cmake`.

The checks for available functions added by https://github.com/pytorch/pytorch/issues/5401 are already present below the code snippet that this PR deletes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61738

Reviewed By: mrshenli

Differential Revision: D29757525

Pulled By: ezyang

fbshipit-source-id: 7f5f9312284973481a8b8a2b9c51cc09774722e9
2021-07-19 13:04:24 -07:00
shmsong
ee2dd35ef4 Resolving native dependency and try_run for cross compile (#59764)
Summary:
This is a PR on build system that provides support for cross compiling on Jetson platforms.

The major change is:

1. Disable try runs for cross compiling in `COMPILER_WORKS`, `BLAS`, and `CUDA`, since a try run cannot be performed in a cross-compile setup.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59764

Reviewed By: soulitzer

Differential Revision: D29524363

Pulled By: malfet

fbshipit-source-id: f06d1ad30b704c9a17d77db686c65c0754db07b8
2021-07-09 09:29:21 -07:00
Elton Leander Pinto
7481c6fc02 Bump googletest version to v1.11.0 (#61395)
Summary:
This PR bumps the `googletest` version to v1.11.0.

To facilitate this change, `CAFFE2_ASAN_FLAG` and `CAFFE2_TSAN_FLAG` are divided into corresponding compiler and linker variants. This is required because `googletest v1.11.0` sets the `-Werror` flag. The `-pie` flag is a linker flag, and passing it to a compiler invocation results in a `-Wunused-command-line-argument` warning, which in turn will cause `googletest` to fail to build with ASAN.
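
The compiler/linker split described above can be pictured with a small shell sketch. This is only an illustration: the variable names `ASAN_COMPILER_FLAGS` and `ASAN_LINKER_FLAGS`, the helper `flags_for_step`, and the flag values are assumptions made for the example, not the actual CMake variables or their contents.

```shell
#!/usr/bin/env bash
# Hypothetical flag sets; names and values are illustrative only.
ASAN_COMPILER_FLAGS="-fsanitize=address -fPIE"
ASAN_LINKER_FLAGS="-pie"

# Print the flags that would be passed for a given build step.
# The linker-only flag is never handed to a compile step, so it cannot
# trigger an "unused command-line argument" warning there.
flags_for_step() {
  case "$1" in
    compile) printf '%s\n' "$ASAN_COMPILER_FLAGS" ;;
    link)    printf '%s\n' "$ASAN_COMPILER_FLAGS $ASAN_LINKER_FLAGS" ;;
    *)       echo "unknown step: $1" >&2; return 1 ;;
  esac
}

flags_for_step compile
flags_for_step link
```

Keeping the two sets separate means a dependency that builds with `-Werror` only ever sees flags that are meaningful for the step being run.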

Fixes https://github.com/pytorch/pytorch/issues/60865

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61395

Reviewed By: iramazanli

Differential Revision: D29620970

Pulled By: 1ntEgr8

fbshipit-source-id: cdb1d3d12e0fff834c2e62971e42c03f8c3fbf1b
2021-07-08 16:29:17 -07:00
Chester Liu
bee6b0be58 Fix warning when running scripts/build_ios.sh (#49457)
Summary:
* Fixes `cmake implicitly converting 'string' to 'STRING' type`
* Fixes `clang: warning: argument unused during compilation: '-mfpu=neon-fp16' [-Wunused-command-line-argument]`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49457

Reviewed By: zhangguanheng66

Differential Revision: D25871014

Pulled By: malfet

fbshipit-source-id: fa0c181ae7a1b8668e47f5ac6abd27a1c735ffce
2021-01-11 19:31:32 -08:00
Jane Xu
c2d37cd990 Change CMake config to enable universal binary for Mac (#50243)
Summary:
This PR is a step towards enabling cross compilation from x86_64 to arm64.

The following has been added:
1. When cross compilation is detected, compile a local universal fatfile to use as protoc.
2. For the simple compile check in MiscCheck.cmake, make sure to compile the small snippet as a universal binary in order to run the check.

**Test plan:**

Kick off a minimal build on a mac intel machine with the macOS 11 SDK with this command:
```
CMAKE_OSX_ARCHITECTURES=arm64 USE_MKLDNN=OFF USE_QNNPACK=OFF USE_PYTORCH_QNNPACK=OFF BUILD_TEST=OFF USE_NNPACK=OFF python setup.py install
```
(If you run the above command before this change, or without macOS 11 SDK set up, it will fail.)

Then check the platform of the built binaries using this command:
```
lipo -info build/lib/libfmt.a
```
Output:
- Before this PR, running a regular build via `python setup.py install` (instead of using the flags listed above):
  ```
  Non-fat file: build/lib/libfmt.a is architecture: x86_64
  ```
- Using this PR:
  ```
  Non-fat file: build/lib/libfmt.a is architecture: arm64
  ```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50243

Reviewed By: malfet

Differential Revision: D25849955

Pulled By: janeyx99

fbshipit-source-id: e9853709a7279916f66aa4c4e054dfecced3adb1
2021-01-08 17:26:08 -08:00
peter
e3daf70184 Fix AVX detection with clang-cl (#35653)
Summary:
Defining macros `/D__F16C__` or something similar won't work on clang-cl.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35653

Differential Revision: D20735878

Pulled By: ezyang

fbshipit-source-id: 392a664b0a9e74222b1a03b8c3f6ebb2c61d867e
2020-03-30 07:53:37 -07:00
peter
45c9ed825a Formatting cmake (to lowercase without space for if/elseif/else/endif) (#35521)
Summary:
Running commands:
```bash
shopt -s globstar

sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i caffe2/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i torch/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i c10/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i cmake/**/*.cmake
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i cmake/**/*.cmake.in
```
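
To see what these substitutions do without touching any files, the same expressions can be applied to standard input. This is a minimal sketch; the function name `normalize_cmake_conditionals` is made up for the example and is not part of the repository.

```shell
#!/usr/bin/env bash
# Same sed expressions as in the commands above, reading stdin
# instead of editing files in place.
normalize_cmake_conditionals() {
  sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' \
      -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' \
      -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g'
}

# Sample input covering each keyword.
printf 'IF (FOO)\nELSEIF (BAR)\nELSE ()\nENDIF ()\n' | normalize_cmake_conditionals
```

The `IF` rules match inside `ENDIF (` and `ELSEIF (` as well, leaving `ENDif(` and `ELSEif(` behind, which is why the last two expressions are needed to finish the job.
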
We may further convert all the commands into lowercase according to the following issue: 77543bde41.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35521

Differential Revision: D20704382

Pulled By: malfet

fbshipit-source-id: 42186b9b1660c34428ab7ceb8d3f7a0ced5d2e80
2020-03-27 14:25:17 -07:00
Nikita Shulga
93983c7d00 Add USE_TSAN option (#35197)
Summary:
Sometimes it is important to run code with thread sanitizer.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35197

Test Plan: CI

Differential Revision: D20605005

Pulled By: malfet

fbshipit-source-id: bcd1a5191b5f859e12b6df6737c980099b1edc36
2020-03-23 14:56:42 -07:00
Hong Xu
daf00beaba Remove duplicated Numa detection code. (#30628)
Summary:
`cmake/Dependencies.cmake` (lines 595-609 at 1111a6b810) has already detected Numa. Duplicated detection and variables may lead to incorrect results.

Close https://github.com/pytorch/pytorch/issues/29968
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30628

Differential Revision: D18782479

Pulled By: ezyang

fbshipit-source-id: f74441f03367f11af8fa59b92d656c6fa070fbd0
2020-01-03 08:48:46 -08:00
Sebastian Messmer
bc2e6d10fa Back out "Revert D17908478: Switch PyTorch/Caffe2 to C++14"
Summary: Original commit changeset: 775d2e29be0b

Test Plan: CI

Reviewed By: mruberry

Differential Revision: D18775520

fbshipit-source-id: a350b3f86b66d97241f208786ee67e9a51172eac
2019-12-03 14:33:43 -08:00
Sebastian Messmer
a2ed50c920 Revert D17908478: Switch PyTorch/Caffe2 to C++14
Test Plan: revert-hammer

Differential Revision:
D17908478

Original commit changeset: 6e340024591e

fbshipit-source-id: 775d2e29be0bc3a0db64f164c8960c44d4877d5d
2019-11-27 14:57:05 -08:00
Sebastian Messmer
d0acc9c085 Switch PyTorch/Caffe2 to C++14 (#30406)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30406

ghstack-source-id: 94642238

Test Plan: waitforsandcastle

Differential Revision: D17908478

fbshipit-source-id: 6e340024591ec2c69521668022999df4a33b4ddb
2019-11-27 10:47:31 -08:00
Tao Xu
14c2492fb5 Fix iOS simulator build (#25633)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25633

The iOS simulator build (x86_64) is broken right now. To fix it:

1. Fix the bug in iOS.cmake
2. Disable avx2 for mobile x86_64 build

Test Plan:
1. The `build_ios.sh` can be run successfully for iOS x86 build. The build script I'm using:

	```shell
	./scripts/build_ios.sh \
	-DBUILD_CAFFE2_MOBILE=OFF \
	-DIOS_PLATFORM=SIMULATOR \
	-DUSE_NNPACK=OFF \
	-DCMAKE_PREFIX_PATH=$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())') \
	-DPYTHON_EXECUTABLE=$(python -c 'import sys; print(sys.executable)')
	```
2. All generated static libs are x86 libs as shown below

	```
	> lipo -i *.a
	Non-fat file: libasmjit.a is architecture: x86_64
	Non-fat file: libc10.a is architecture: x86_64
	Non-fat file: libcaffe2_protos.a is architecture: x86_64
	Non-fat file: libclog.a is architecture: x86_64
	Non-fat file: libcpuinfo.a is architecture: x86_64
	Non-fat file: libfbgemm.a is architecture: x86_64
	Non-fat file: libtorch.a is architecture: x86_64
	```

Differential Revision: D17183803

Pulled By: xta0

fbshipit-source-id: 870d5433a3616b8e7ed9fb7dfab6aebbda26f723
2019-09-04 08:58:25 -07:00
Edward Yang
c56464d13e Turn off warnings on Windows CI. (#24331)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24331

Currently our logs are something like 40M a pop.  Turning off warnings and turning on verbose makefiles (to see the compile commands) reduces this to more like 8M. We could probably reduce log size more but verbose makefile is really useful and we'll keep it turned on for Windows.

Some findings:

1. Setting `CMAKE_VERBOSE_MAKEFILE` inside CMakeLists.txt itself as suggested in https://github.com/ninja-build/ninja/issues/900#issuecomment-417917630 does not work on Windows. Setting `-DCMAKE_VERBOSE_MAKEFILE=1` does work (and we respect this environment variable.)
2. The high (`/W3`) warning level that is on by default with MSVC is due to cmake inserting this in the default flags. On recent versions of cmake, CMP0092 can be used to disable this flag in the default set. The string replace trick sort of works, but the standard snippet you'll find on the internet won't disable the flag from nvcc. I inspected the CUDA cmake code and verified it does respect CMP0092
3. `EHsc` is also in the default flags; this one cannot be suppressed via a policy. The string replace trick seems to work...
4. ... however, it seems nvcc implicitly inserts an `/EHs` after `-Xcompiler` specified flags, which means that if we add `/EHa` to our set of flags, you'll get a warning from nvcc. So we probably have to figure out how to exclude EHa from the nvcc flags set (EHs does seem to work fine.)
5. To suppress warnings in nvcc, you must BOTH pass `-w` and `-Xcompiler /w`. Individually these are not enough.

The patch applies these things; it also fixes a bug where nvcc verbose command printing doesn't work with `-GNinja`.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D17131746

Pulled By: ezyang

fbshipit-source-id: fb142f8677072a5430664b28155373088f074c4b
2019-08-30 07:11:07 -07:00
peter
061f2d1683 Skip useless macros from Windows.h (#25444)
Summary:
Applying https://github.com/pytorch/pytorch/issues/25398 to the whole project.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25444

Differential Revision: D17131251

Pulled By: ezyang

fbshipit-source-id: 7a8817f3444aebd6028bf1056514355e2c4cc748
2019-08-30 06:42:44 -07:00
Jiakai Liu
8cd6d2f101 rename BUILD_ATEN_MOBILE to INTERN_BUILD_MOBILE and make it private (#19942)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19942
ghimport-source-id: 6bacc8f5ad7911af8cf5fde9fcb604ade666b862

Reviewed By: dzhulgakov

Differential Revision: D15144325

Pulled By: ljk53

fbshipit-source-id: d63a70f007110d5d1055d6bec1ed09a1a6aafdae
2019-05-01 00:20:24 -07:00
Balint Cristian
67fdb4abf7 AVX2 with GCC9 fix. (#18991)
Summary:
Dear All,

The proposed patch fixes the test code snippets used in the cmake infrastructure and the resulting implicit failure to properly set the `CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS` flag. Without the fix, libcaffe2.so will have some `UND` AVX2-related references, rendering it unusable.

* Using GCC 9 test code from cmake build infra always fails:
```
$ gcc  -O2 -g -pipe -Wall -m64 -mtune=generic -fopenmp -DCXX_HAS_AVX_1 -fPIE -o test.o -c test.c -mavx2
test.c: In function ‘main’:
test.c:11:26: error: incompatible type for argument 1 of ‘_mm256_extract_epi64’
   11 |     _mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
      |                          ^
      |                          |
      |                          __m256 {aka __vector(8) float}
In file included from /usr/lib/gcc/x86_64-redhat-linux/9/include/immintrin.h:51,
                 from test.c:4:
/usr/lib/gcc/x86_64-redhat-linux/9/include/avxintrin.h:550:31: note: expected ‘__m256i’ {aka ‘__vector(4) long long int’} but argument is of type ‘__m256’ {aka ‘__vector(8) float’}
  550 | _mm256_extract_epi64 (__m256i __X, const int __N)
      |

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 9.0.1 20190328 (Red Hat 9.0.1-0.12) (GCC)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18991

Differential Revision: D14821838

Pulled By: ezyang

fbshipit-source-id: 7eb3a854a1a831f6fda8ed7ad089746230b529d7
2019-04-07 08:27:00 -07:00
Junjie Bai
0fe6e8c870 Remove ComputeLibrary submodule
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18052

Reviewed By: ezyang

Differential Revision: D14477355

fbshipit-source-id: c56b802f6d69701596c327cf9af6782f30e335fa
2019-03-16 09:06:42 -07:00
Thomas Viehmann
13bc002422 fixes for AVX detection (#17915)
Summary:
Our AVX2 routines use functions such as _mm256_extract_epi64
that do not exist on 32 bit systems even when they have AVX2.
This disables AVX2 when _mm256_extract_epi64 does not exist.
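
The idea of probing for `_mm256_extract_epi64` can be sketched outside CMake as a compile-only check. The snippet below is an illustration under assumptions, not the check that lives in the build system: the helper name `avx2_extract_probe`, the output strings, and the use of `${CC:-cc}` are all made up for the example.

```shell
#!/usr/bin/env bash
# Try to compile (not run) a tiny program that calls _mm256_extract_epi64
# with -mavx2. If the compiler, header, or intrinsic is unavailable, the
# compile fails and AVX2 is reported as disabled.
avx2_extract_probe() {
  local tmpdir
  tmpdir=$(mktemp -d) || return 1
  cat > "$tmpdir/probe.c" <<'EOF'
#include <immintrin.h>
int main(void) {
  __m256i x = _mm256_set1_epi32(0);
  return (int)_mm256_extract_epi64(x, 0);
}
EOF
  if "${CC:-cc}" -mavx2 -c "$tmpdir/probe.c" -o "$tmpdir/probe.o" >/dev/null 2>&1; then
    echo "avx2-usable"
  else
    echo "avx2-disabled"
  fi
  rm -rf "$tmpdir"
}

avx2_extract_probe
```

Because the probe only compiles and never executes the result, it reports on what the toolchain can build for the target rather than on what the build machine's CPU supports.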

This fixes the "local" part of #17901 (except disabling FBGEMM),
but there also is sleef to be updated and NNPACK to be fixed,
see the bug report for further discussion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17915

Differential Revision: D14437338

Pulled By: soumith

fbshipit-source-id: d4ef7e0801b5d1222a855a38ec207dd88b4680da
2019-03-13 03:55:06 -07:00
Tongliang Liao
55511004d1 Resolve errors in perfkernel for Windows (#16031)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16031

1. MSVC only has _mm_prefetch(const char*, int). Fixed in both python codegen and C++ files.
2. uint32_t in "cvtsh_ss_bugfix.h" requires "#include <cstdint>".
3. Some files use gflags headers. Add dependency via c10.
4. Isolate arch flags with interface library and private compile options.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15753

Reviewed By: dskhudia

Differential Revision: D13636233

Pulled By: jspark1105

fbshipit-source-id: cdcbd4240e07b749554a2a5676c11af88f23c31d
2019-01-16 21:51:00 -08:00
Daya S Khudia
18de330e86 CMake integration for int8 server operators
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13558

Reviewed By: Maratyszcza

Differential Revision: D12945460

Pulled By: dskhudia

fbshipit-source-id: 1a91027b305fd6af77eebd9a4fad092a12f54712
2018-11-06 15:45:15 -08:00
Orion Reblitz-Richardson
99d24aefc3 Move a number of ATen checks out of Dependencies.cmake (#12990)
Summary:
cc Yangqing mingzhe09088 anderspapitto mingzhe09088
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12990

Differential Revision: D10862301

Pulled By: orionr

fbshipit-source-id: 62ba09cf0725f29692fac71bc30173469283390b
2018-10-25 17:26:25 -07:00
Edward Yang
5b971445a6 Typo fix (#12826)
Summary:
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12826

Differential Revision: D10449047

Pulled By: ezyang

fbshipit-source-id: eb10aa5886339b43bb8c239dd8742e458f3d024d
2018-10-18 11:36:00 -07:00
Yangqing Jia
7788ec9dd1 Remove dangling cmake check for long typemeta (#12356)
Summary:
TSIA
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12356

Differential Revision: D10212726

Pulled By: Yangqing

fbshipit-source-id: b9c2c778fb496278477ef323ecfefd5d19d1af3c
2018-10-05 09:43:32 -07:00
Junjie Bai
157fb46ffc Add -rdynamic only to linker flags to avoid compiler warnings (#10789)
Summary:
`clang: warning: argument unused during compilation: '-rdynamic'`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10789

Reviewed By: houseroad

Differential Revision: D9467385

Pulled By: bddppq

fbshipit-source-id: 610550a8f34cfa66b9dfa183752eb129dae21eaa
2018-08-27 17:56:21 -07:00
Junjie Bai
ba5d33bede Re-Enable ATen in C2 in integration builds to test ONNX ATen conversions
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10060

Differential Revision: D9081387

Pulled By: bddppq

fbshipit-source-id: 13cbff63df5241e013d4ebacfcd6da082e7196f6
2018-07-31 15:27:05 -07:00
Junjie Bai
f779202711 Correctly set CAFFE2_DISABLE_NUMA when USE_NUMA=OFF in cmake (#10061)
Summary:
previously https://github.com/pytorch/pytorch/blob/master/caffe2/core/numa.cc still gets compiled even when USE_NUMA=OFF
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10061

Reviewed By: houseroad

Differential Revision: D9081385

Pulled By: bddppq

fbshipit-source-id: ad28b647e0033727839770b1da0fba341b1b7787
2018-07-31 11:01:51 -07:00
Gregory Chanan
6fb9acfc16 Revert empty n-dim and ATen in C2 integration builds
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10064

Differential Revision: D9082082

Pulled By: gchanan

fbshipit-source-id: ae49470f5b4c89b13beb55fd825de1ba05b6a4fa
2018-07-31 07:25:56 -07:00
Junjie Bai
57750bd638 Enable ATen in C2 in integration builds to test ONNX ATen conversions (#10014)
Summary:
zrphercule
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10014

Reviewed By: houseroad

Differential Revision: D9061842

Pulled By: bddppq

fbshipit-source-id: 1e1c2aeae62dd2cc5c6a8d5e1d395ea5cf882734
2018-07-30 15:01:13 -07:00
Yangqing Jia
1a03ba51dc [cmake] Add and export Modules_CUDA_fix (#8271)
* Add and export Modules_CUDA_fix

* actually, need to include before finding cuda
2018-06-07 21:50:30 -07:00
Orion Reblitz-Richardson
4bf0202cac [build] Have PyTorch depend on minimal libcaffe2.so instead of libATen.so (#7399)
* Have PyTorch depend on minimal libcaffe2.so instead of libATen.so

* Build ATen tests as a part of Caffe2 build

* Hopefully cufft and nvcc fPIC fixes

* Make ATen install components optional

* Add tests back for ATen and fix TH build

* Fixes for test_install.sh script

* Fixes for cpp_build/build_all.sh

* Fixes for aten/tools/run_tests.sh

* Switch ATen cmake calls to USE_CUDA instead of NO_CUDA

* Attempt at fix for aten/tools/run_tests.sh

* Fix typo in last commit

* Fix valgrind call after pushd

* Be forgiving about USE_CUDA disable like PyTorch

* More fixes on the install side

* Link all libcaffe2 during test run

* Make cuDNN optional for ATen right now

* Potential fix for non-CUDA builds

* Use NCCL_ROOT_DIR environment variable

* Pass -fPIC through nvcc to base compiler/linker

* Remove THCUNN.h requirement for libtorch gen

* Add Mac test for -Wmaybe-uninitialized

* Potential Windows and Mac fixes

* Move MSVC target props to shared function

* Disable cpp_build/libtorch tests on Mac

* Disable sleef for Windows builds

* Move protos under BUILD_CAFFE2

* Remove space from linker flags passed with -Wl

* Remove ATen from Caffe2 dep libs since directly included

* Potential Windows fixes

* Preserve options while sleef builds

* Force BUILD_SHARED_LIBS flag for Caffe2 builds

* Set DYLD_LIBRARY_PATH and LD_LIBRARY_PATH for Mac testing

* Pass TORCH_CUDA_ARCH_LIST directly in cuda.cmake

* Fixes for the last two changes

* Potential fix for Mac build failure

* Switch Caffe2 to build_caffe2 dir to not conflict

* Cleanup FindMKL.cmake

* Another attempt at Mac cpp_build fix

* Clear cpp-build directory for Mac builds

* Disable test in Mac build/test to match cmake
2018-05-24 07:47:27 -07:00
Edward Z. Yang
ed9952dd25 Update FindCUDA to cmake master as of 561238bb6f07a5ab31293928bd98f6f… (#6241)
* Update FindCUDA to cmake master as of 561238bb6f07a5ab31293928bd98f6f8911d8bc1

NB: I DID have to apply one local patch; it's the `include_guard` change. Should
be obvious next time you do an update.

Relevant commits:

    commit 23119366e9d4e56e13c1fdec9dbff5e8f8c55ee5
    Author: Edward Z. Yang <ezyang@fb.com>
    Date:   Wed Mar 28 11:33:56 2018 -0400

        FindCUDA: Make nvcc configurable via CUDA_NVCC_EXECUTABLE env var

        This is useful if, for example, you want ccache to be used
        for nvcc.  With the current behavior, cmake always picks up
        /usr/local/cuda/bin/nvcc, even if there is a ccache nvcc
        stub in the PATH.  Allowing for CUDA_NVCC_EXECUTABLE lets
        us work around the problem.

        Signed-off-by: Edward Z. Yang <ezyang@fb.com>

    commit e743fc8e9137692232f0220ac901f5a15cbd62cf
    Author: Henry Fredrick Schreiner <henry.fredrick.schreiner@cern.ch>
    Date:   Thu Mar 15 15:30:50 2018 +0100

        FindCUDA/select_compute_arch: Add support for CUDA as a language

        Even though this is an internal module, we can still prepare it to
        be used in another public-facing module outside of `FindCUDA`.

        Issue: #16586

    commit 193082a3c803a6418f0f1b5976dc34a91cf30805
    Author: luz.paz <luzpaz@users.noreply.github.com>
    Date:   Thu Feb 8 06:27:21 2018 -0500

        MAINT: Misc. typos

        Found via `codespell -q 3 -I ../cmake-whitelist.txt`.

    commit 9f74aaeb7d6649241c4a478410e87d092c462960
    Author: Brad King <brad.king@kitware.com>
    Date:   Tue Jan 30 08:18:11 2018 -0500

        FindCUDA: Fix regression in per-config flags

        Changes in commit 48f7e2d300 (Unhardcode the CMAKE_CONFIGURATION_TYPES
        values, 2017-11-27) accidentally left `CUDA_configuration_types`
        undefined, but this is used in a few places to handle per-config flags.
        Restore it.

        Fixes: #17671

    commit d91b2d9158cbe5d65bfcc8f7512503d7f226ad91
    Author: luz.paz <luzpaz@users.noreply.github.com>
    Date:   Wed Jan 10 12:34:14 2018 -0500

        MAINT: Misc. typos

        Found via `codespell`

    commit d08f3f551fa94b13a1d43338eaed68bcecb95cff
    Merge: 1be22978e 1f4d7a071
    Author: Brad King <brad.king@kitware.com>
    Date:   Wed Jan 10 15:34:57 2018 +0000

        Merge topic 'unhardcode-configuration-types'

        1f4d7a07 Help: Add references and backticks in LINK_FLAGS prop_tgt
        48f7e2d3 Unhardcode the CMAKE_CONFIGURATION_TYPES values

        Acked-by: Kitware Robot <kwrobot@kitware.com>
        Merge-request: !1345

    commit 5fbfa18fadf945963687cd95627c1bc62b68948a
    Merge: bc88329e5 ff41a4b81
    Author: Brad King <brad.king@kitware.com>
    Date:   Tue Jan 9 14:26:35 2018 +0000

        Merge topic 'FindCUDA-deduplicate-c+std-host-flags'

        ff41a4b8 FindCUDA: de-duplicates C++11 flag when propagating host flags.

        Acked-by: Kitware Robot <kwrobot@kitware.com>
        Merge-request: !1628

    commit bc88329e5ba7b1a14538f23f4fa223ac8d6d5895
    Merge: 89d127463 fab1b432e
    Author: Brad King <brad.king@kitware.com>
    Date:   Tue Jan 9 14:26:16 2018 +0000

        Merge topic 'msvc2017-findcuda'

        fab1b432 FindCUDA: Update to properly find MSVC 2017 compiler tools

        Acked-by: Kitware Robot <kwrobot@kitware.com>
        Acked-by: Robert Maynard <robert.maynard@kitware.com>
        Merge-request: !1631

    commit 48f7e2d30000dc57c31d3e3ab81077950704a587
    Author: Beren Minor <beren.minor+git@gmail.com>
    Date:   Mon Nov 27 19:22:11 2017 +0100

        Unhardcode the CMAKE_CONFIGURATION_TYPES values

        This removes duplicated code for per-config variable initialization by
        providing a `cmake_initialize_per_config_variable(<PREFIX> <DOCSTRING>)`
        function.

        This function initializes a `<PREFIX>` cache variable from `<PREFIX>_INIT`
        and unless the `CMAKE_NOT_USING_CONFIG_FLAGS` variable is defined, does
        the same with `<PREFIX>_<CONFIG>` from `<PREFIX>_<CONFIG>_INIT` for every
        `<CONFIG>` in `CMAKE_CONFIGURATION_TYPES` for multi-config generators or
        `CMAKE_BUILD_TYPE` for single-config generators.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Polyfill CMakeInitializeConfigs

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Tweak condition for when to use bundled FindCUDA support.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Comment out include_guard.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-04-04 17:04:21 -04:00
Orion Reblitz-Richardson
e881efde79 Use local FindCUDA for CMake < 3.7
2018-03-28 10:05:20 -07:00
Orion Reblitz-Richardson
3a84574c81 Update CAFFE2_LINK_LOCAL_PROTOBUF functionality.
* Continuation of https://github.com/caffe2/caffe2/pull/2306 and based on Yangqing's PR at https://github.com/caffe2/caffe2/pull/2326
* Put caffe2_protos as static library and link it whole to libcaffe2.so
* For protobuf::libprotobuf, only link it to libcaffe2_protos (and hence libcaffe2.so), but not any downstream library. This avoids manipulating protobuf objects across dll boundaries.
* After the above, linking will complain that fixed_address_empty_string is not found. This is because we compiled protobuf with hidden visibility, and the generated caffe2.pb.h has an inline function that invokes the inline protobuf function GetEmptyStringAlreadyInited()
* Added sed-like commands to replace the generated header to use caffe2::GetEmptyStringAlreadyInited() instead. And, in proto_utils.cc, implement a function that essentially routes the function call to protobuf's internal one. The reason this works is that, caffe2::G... is visible globally, and libcaffe2.so is able to see the real protobuf one. This ensures that we are always calling protobuf functions that are inside libcaffe2.so.
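
The header rewrite in the last bullet can be sketched as a plain `sed` substitution. This is a hedged illustration only: the helper name `redirect_empty_string_helper` is made up, and the fully qualified spelling `::google::protobuf::internal::GetEmptyStringAlreadyInited` is an assumption about what the generated header contains, which may differ between protobuf versions.

```shell
#!/usr/bin/env bash
# Assumption: the generated header refers to the protobuf helper by this
# fully qualified name; the real generated text may differ.
redirect_empty_string_helper() {
  sed -e 's/::google::protobuf::internal::GetEmptyStringAlreadyInited/::caffe2::GetEmptyStringAlreadyInited/g'
}

# Sample line standing in for a fragment of a generated header.
printf '%s\n' 'return ::google::protobuf::internal::GetEmptyStringAlreadyInited();' \
  | redirect_empty_string_helper
```

After the rewrite, the inline code in the header calls a function in the `caffe2` namespace, which can then forward to protobuf's own implementation from inside libcaffe2.so.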
2018-03-28 10:05:20 -07:00