Commit Graph

49 Commits

Author SHA1 Message Date
cyy
1433bc1455 Remove CAFFE2_USE_EXCEPTION_PTR (#147247)
The check was for older compilers and is now always true.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147247
Approved by: https://github.com/janeyx99
2025-03-06 02:56:23 +00:00
Nikita Shulga
8f5ce865a4 [Build] Add COMMIT_SHA to caffe2::GetBuildOptions (#141313)
Using the same `tools/generate_torch_version.py` script.

It's already available at the Python level, but not at the C++ one.
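
For reference, a minimal sketch of the existing Python-level access; the C++ accessor added here exposes the same value through `caffe2::GetBuildOptions`:
```
import torch

# torch/version.py is generated at build time by tools/generate_torch_version.py.
print(torch.version.git_version)  # commit SHA of the checkout torch was built from
```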

Please note that updating the commit hash will force recompilation of fewer than 10 files, according to:
```
% touch caffe2/core/macros.h; ninja -d explain -j1 -v -n torch_python
ninja explain: output caffe2/torch/CMakeFiles/gen_torch_version doesn't exist
ninja explain: caffe2/torch/CMakeFiles/gen_torch_version is dirty
ninja explain: /Users/malfet/git/pytorch/pytorch/torch/version.py is dirty
ninja explain: output third_party/kineto/libkineto/CMakeFiles/libkineto_defs.bzl of phony edge with no inputs doesn't exist
ninja explain: third_party/kineto/libkineto/CMakeFiles/libkineto_defs.bzl is dirty
ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546390618881 vs 1732301802196214000)
ninja explain: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o is dirty
ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/core/common.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546233600752 vs 1732301802196214000)
ninja explain: caffe2/CMakeFiles/torch_cpu.dir/core/common.cc.o is dirty
ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/serialize/inline_container.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546651089243 vs 1732301802196214000)
ninja explain: caffe2/CMakeFiles/torch_cpu.dir/serialize/inline_container.cc.o is dirty
ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/serialize/file_adapter.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546224176845 vs 1732301802196214000)
ninja explain: caffe2/CMakeFiles/torch_cpu.dir/serialize/file_adapter.cc.o is dirty
ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/utils/threadpool/ThreadPool.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546464535054 vs 1732301802196214000)
ninja explain: caffe2/CMakeFiles/torch_cpu.dir/utils/threadpool/ThreadPool.cc.o is dirty
ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/impl.cpp.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301550062608920 vs 1732301802196214000)
ninja explain: caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/impl.cpp.o is dirty
ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/mps/MPSFallback.mm.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301547538843492 vs 1732301802196214000)
ninja explain: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/mps/MPSFallback.mm.o is dirty
```

Differential Revision: [D66468257](https://our.internmc.facebook.com/intern/diff/D66468257)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141313
Approved by: https://github.com/ezyang
2024-11-26 00:09:36 +00:00
cyyever
c638a40a93 [Caffe2] Remove unused AVX512 code (#133160)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133160
Approved by: https://github.com/albanD
2024-08-23 23:16:16 +00:00
PyTorch MergeBot
fa1d7b0262 Revert "Remove unused Caffe2 macros (#132979)"
This reverts commit da65cfbdea.

Reverted https://github.com/pytorch/pytorch/pull/132979 on behalf of https://github.com/ezyang due to these are apparently load bearing internally ([comment](https://github.com/pytorch/pytorch/pull/132979#issuecomment-2284666332))
2024-08-12 18:34:56 +00:00
cyy
da65cfbdea Remove unused Caffe2 macros (#132979)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132979
Approved by: https://github.com/ezyang
2024-08-09 04:48:20 +00:00
cyy
574ae9afb8 [Submodule] Remove third-party onnx-tensorrt (#126542)
It seems that TensorRT is no longer used by the C++ code, possibly due to the removal of Caffe2.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126542
Approved by: https://github.com/ezyang
2024-05-19 22:34:24 +00:00
Nikita Shulga
7ca6e0d38f [EZ] Add CUSPARSELT to build variables (#116213)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116213
Approved by: https://github.com/Skylion007, https://github.com/kit1980, https://github.com/atalman
ghstack dependencies: #116212
2023-12-21 01:02:11 +00:00
Nikita Shulga
74119a3482 [EZ] Fix typo in USE_GLOO var (#116212)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116212
Approved by: https://github.com/Skylion007, https://github.com/kit1980
2023-12-21 01:02:11 +00:00
hongxyan
66a76516bf [ROCm] Disabling Kernel Asserts for ROCm by default - fix and clean up and refactoring (#114660)
Related to #103973, #110532, #108404, #94891

**Context:**
As commented in 6ae0554d11/cmake/Dependencies.cmake (L1198)
Kernel asserts are enabled by default for CUDA and disabled for ROCm.
However, the logic was somewhat broken, and kernel asserts were still enabled for ROCm.

Disabling kernel asserts is also needed for users who do not have PCIe atomics support. These community users have verified that disabling kernel asserts on the PyTorch/ROCm platform fixed their PyTorch workflows, such as torch.sum scripts and stable-diffusion (see the related issues).

**Changes:**

This pull request serves the following purposes:
* Refactor and clean up the logic, making it simpler to enable and disable kernel asserts for ROCm
* Fix the bug where kernel asserts for ROCm were not disabled by default

Specifically,
- Renamed `TORCH_DISABLE_GPU_ASSERTS` to `C10_USE_ROCM_KERNEL_ASSERT` for the following reasons:
(1) This variable only applies to ROCm.
(2) The new name aligns better with the `CUDA_KERNEL_ASSERT` macro.
(3) With `USE_` in front of the name, we can easily control the feature with an environment variable during the build (e.g. `USE_ROCM_KERNEL_ASSERT=1 python setup.py develop` will enable kernel asserts for a ROCm build).
- Got rid of `ROCM_FORCE_ENABLE_GPU_ASSERTS` to simplify the logic and make it easier to understand and maintain
- Added `#cmakedefine` to carry the CMake variable over to C++

**Tests:**
(1) Build in default mode and verify that `USE_ROCM_KERNEL_ASSERT` is OFF (0) and kernel asserts are disabled:

```
python setup.py develop
```
Verify that CMakeCache.txt has the correct value:
```
/xxxx/pytorch/build$ grep USE_ROCM_KERNEL_ASSERT CMakeCache.txt
USE_ROCM_KERNEL_ASSERT:BOOL=0
```
Tested the following code in both ROCm and CUDA builds, expecting different return codes.

```
subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"])
```
This piece of code is adapted from the unit test below, to get around the fact that the unit test is currently skipped for ROCm. (We will look into enabling this unit test in the future.)

```
python test/test_cuda_expandable_segments.py -k test_fixed_cuda_assert_async
```

Ran the following script, expecting r == 0 since CUDA_KERNEL_ASSERT is defined as a no-op:
```
>>> import sys
>>> import subprocess
>>> r=subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"])
>>> r
0
```

(2) Enable kernel asserts by building with USE_ROCM_KERNEL_ASSERT=1 or USE_ROCM_KERNEL_ASSERT=ON:
```
USE_ROCM_KERNEL_ASSERT=1 python setup.py develop
```

Verify that `USE_ROCM_KERNEL_ASSERT` is `1`:
```
/xxxx/pytorch/build$ grep USE_ROCM_KERNEL_ASSERT CMakeCache.txt
USE_ROCM_KERNEL_ASSERT:BOOL=1
```

Run the assert test, expecting a nonzero return code.

```
>>> import sys
>>> import subprocess
>>> r=subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"])
>>>/xxxx/pytorch/aten/src/ATen/native/hip/TensorCompare.hip:108: _assert_async_cuda_kernel: Device-side assertion `input[0] != 0' failed.
:0:rocdevice.cpp            :2690: 2435301199202 us: [pid:206019 tid:0x7f6cf0a77700] Callback: Queue 0x7f64e8400000 aborting with error : HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception. code: 0x1016

>>> r
-6
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114660
Approved by: https://github.com/jeffdaily, https://github.com/malfet, https://github.com/jithunnair-amd
2023-12-13 15:44:53 +00:00
mikey dagitses
5f5d675587 remove unused CAFFE2_VERSION macros (#97337)
Summary:
Nothing reads these and they are completely subsumed by TORCH_VERSION.

Getting rid of these will be helpful for build unification, since they
are also not used internally.

Test Plan: Rely on CI.

Reviewers: sahanp

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97337
Approved by: https://github.com/malfet
2023-03-24 16:02:35 +00:00
Pruthvi Madugundu
fbd08fb358 Introduce TORCH_DISABLE_GPU_ASSERTS (#84190)
- Asserts for CUDA are enabled by default
- Disabled for ROCm by default by setting `TORCH_DISABLE_GPU_ASSERTS` to `ON`
- Can be enabled for ROCm by setting the above variable to `OFF` during the build, or forcefully enabled by setting `ROCM_FORCE_ENABLE_GPU_ASSERTS:BOOL=ON`

This is a follow-up change per the comment in PR #81790 ([link](https://github.com/pytorch/pytorch/pull/81790#issuecomment-1215929021)).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84190
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2022-11-04 04:43:05 +00:00
PyTorch MergeBot
0fa23663cc Revert "Introduce TORCH_DISABLE_GPU_ASSERTS (#84190)"
This reverts commit 1e2c4a6e0e.

Reverted https://github.com/pytorch/pytorch/pull/84190 on behalf of https://github.com/malfet due to Needs internal changes, has to be landed via co-dev
2022-11-02 18:13:37 +00:00
Pruthvi Madugundu
1e2c4a6e0e Introduce TORCH_DISABLE_GPU_ASSERTS (#84190)
- Asserts for CUDA are enabled by default
- Disabled for ROCm by default by setting `TORCH_DISABLE_GPU_ASSERTS` to `ON`
- Can be enabled for ROCm by setting the above variable to `OFF` during the build, or forcefully enabled by setting `ROCM_FORCE_ENABLE_GPU_ASSERTS:BOOL=ON`

This is a follow-up change per the comment in PR #81790 ([link](https://github.com/pytorch/pytorch/pull/81790#issuecomment-1215929021)).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84190
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2022-11-02 17:41:57 +00:00
zhang, xiaobing
86b86202b5 fix torch.config can't respect USE_MKLDNN flag issue (#75001)
Fixes https://github.com/pytorch/pytorch/issues/74949, which reports that torch.__config__ doesn't respect the USE_MKLDNN flag.
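
A quick way to cross-check the fix (a sketch; the runtime backend flag should now agree with the `USE_MKLDNN` setting that `torch.__config__` reports):
```
import torch

# True iff this build was compiled with MKL-DNN (oneDNN) support.
print(torch.backends.mkldnn.is_available())
```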

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75001
Approved by: https://github.com/malfet
2022-07-17 15:00:48 +00:00
Jing Xu
3c7044728b Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289)
A more detailed description of the benefits can be found at #41001. This is Intel's counterpart of NVIDIA's NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx).

ITT is an API for labeling trace data during application execution across different Intel tools.
For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both the standalone VTune Profiler (https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html) and, in the future, Kineto-integrated VTune functionality.
It works for both Intel CPU and Intel XPU devices.

Pitch
Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer-customized code scopes on CPU, analogous to NVTX for NVIDIA GPUs.

This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch.

Usage example:
```
with torch.autograd.profiler.emit_itt():
    for i in range(10):
        torch.itt.range_push('step_{}'.format(i))
        model(input)
        torch.itt.range_pop()
```

cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289
Approved by: https://github.com/malfet
2022-07-13 13:50:15 +00:00
PyTorch MergeBot
1454515253 Revert "Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289)"
This reverts commit f988aa2b3f.

Reverted https://github.com/pytorch/pytorch/pull/63289 on behalf of https://github.com/malfet due to broke trunk, see f988aa2b3f
2022-06-30 12:49:41 +00:00
Jing Xu
f988aa2b3f Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289)
A more detailed description of the benefits can be found at #41001. This is Intel's counterpart of NVIDIA's NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx).

ITT is an API for labeling trace data during application execution across different Intel tools.
For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both the standalone VTune Profiler (https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html) and, in the future, Kineto-integrated VTune functionality.
It works for both Intel CPU and Intel XPU devices.

Pitch
Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer-customized code scopes on CPU, analogous to NVTX for NVIDIA GPUs.

This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch.

Usage example:
```
with torch.autograd.profiler.emit_itt():
    for i in range(10):
        torch.itt.range_push('step_{}'.format(i))
        model(input)
        torch.itt.range_pop()
```

cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289
Approved by: https://github.com/malfet
2022-06-30 05:14:03 +00:00
Pruthvi Madugundu
085e2f7bdd [ROCm] Changes not to rely on CUDA_VERSION or HIP_VERSION (#65610)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610

- Replace HIP_PLATFORM_HCC with USE_ROCM
- Don't rely on CUDA_VERSION or HIP_VERSION; use USE_ROCM and ROCM_VERSION instead.

- In the next PR:
   - Will remove the mapping from CUDA_VERSION to HIP_VERSION and from CUDA to HIP in hipify.
   - HIP_PLATFORM_HCC is deprecated, so HIP_PLATFORM_AMD will be added to support HIP host code compilation with gcc.

cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd

Reviewed By: jbschlosser

Differential Revision: D30909053

Pulled By: ezyang

fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06
2021-09-29 09:55:43 -07:00
Ailing Zhang
621198978a Move USE_NUMPY to more appropriate targets (#51143)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51143

Test Plan: CI

Reviewed By: wconstab

Differential Revision: D26084123

fbshipit-source-id: af4abe4ef87c1ebe5434938320526a925f5c34c8
2021-01-27 15:44:12 -08:00
Rong Rong
54022e4f9b add new build settings to torch.__config__ (#48380)
Summary:
Many newly added build settings are not saved in torch.__config__; this adds them to the mix.
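
For context, a sketch of where these settings surface for users:
```
import torch

# Prints the build settings recorded at compile time (compiler, CUDA/cuDNN
# versions, USE_* flags, ...) -- the string this PR extends.
print(torch.__config__.show())
```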

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48380

Reviewed By: samestep

Differential Revision: D25161951

Pulled By: walterddr

fbshipit-source-id: 1d3dee033c93f2d1a7e2a6bcaf88aedafeac8d31
2020-12-01 14:16:36 -08:00
Jiakai Liu
3a0e35c9f2 [pytorch] deprecate static dispatch (#43564)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43564

Static dispatch was originally introduced for mobile selective build.

Since we have added selective build support for dynamic dispatch and
tested it in FB production for months, we can deprecate static dispatch
to reduce the complexity of the codebase.

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D23324452

Pulled By: ljk53

fbshipit-source-id: d2970257616a8c6337f90249076fca1ae93090c7
2020-08-27 14:52:48 -07:00
Hong Xu
daf00beaba Remove duplicated Numa detection code. (#30628)
Summary:
cmake/Dependencies.cmake (1111a6b810/cmake/Dependencies.cmake (L595-L609)) already detects NUMA. Duplicated detection and variables may lead to incorrect results.

Closes https://github.com/pytorch/pytorch/issues/29968
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30628

Differential Revision: D18782479

Pulled By: ezyang

fbshipit-source-id: f74441f03367f11af8fa59b92d656c6fa070fbd0
2020-01-03 08:48:46 -08:00
Richard Zou
9047d4df45 Remove all remaining usages of BUILD_NAMEDTENSOR (#31116)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31116

Changelist:
- remove BUILD_NAMEDTENSOR macro
- remove torch._C._BUILD_NAMEDTENSOR
- remove all python behavior that relies on torch._C._BUILD_NAMEDTENSOR

Future:
- In the next diff, I will remove all usages of
ATen/core/EnableNamedTensor.h since that header doesn't do anything
anymore
- After that, we'll be done with the BUILD_NAMEDTENSOR removal.
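
For context, named tensors themselves stay available unconditionally after this removal; a minimal sketch of the (prototype) Python API:
```
import torch

# No BUILD_NAMEDTENSOR build flag is needed for this anymore.
t = torch.zeros(2, 3, names=('N', 'C'))
print(t.names)           # ('N', 'C')
print(t.sum('C').names)  # ('N',)
```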

Test Plan: - run CI

Differential Revision: D18934951

Pulled By: zou3519

fbshipit-source-id: 0a0df0f1f0470d0a01c495579333a2835aac9f5d
2019-12-12 09:53:03 -08:00
Lucian Grijincu
9c9f14029d Revert D16929363: Revert D16048264: Add static dispatch mode to reduce mobile code size
Differential Revision:
D16929363

Original commit changeset: 69d302929e18

fbshipit-source-id: add36a6047e4574788eb127c40f6166edeca705f
2019-08-20 17:08:31 -07:00
Lucian Grijincu
bd6cf5099b Revert D16048264: Add static dispatch mode to reduce mobile code size
Differential Revision:
D16048264

Original commit changeset: ad1e50951273

fbshipit-source-id: 69d302929e183e2da26b64dcc24c69c3b7de186b
2019-08-20 16:26:18 -07:00
Roy Li
6824c9018d Add static dispatch mode to reduce mobile code size
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22335

Test Plan: Imported from OSS

Differential Revision: D16048264

Pulled By: li-roy

fbshipit-source-id: ad1e50951273962a51bac7c25c3d2e5a588a730e
2019-08-20 12:21:32 -07:00
Hong Xu
693871ded3 Rename macros and build options NAMEDTENSOR_ENABLED to BUILD_NAMEDTENSOR (#22360)
Summary:
Currently the build system accepts USE_NAMEDTENSOR from the environment
and turns it into NAMEDTENSOR_ENABLED when passing it to CMake. This
discrepancy does not seem necessary and complicates the build system.
The naming of this build option is also semantically incorrect
("BUILD_" vis-a-vis "USE_"). This commit eradicates the issue before it
makes it into a stable release.

Support for NO_NAMEDTENSOR is also removed, since PyTorch has been
quite inconsistent about "NO_*" build options.

 ---

Note: All environment variables whose names start with `BUILD_` are currently passed to CMake automatically, with no need for an additional wrapper.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22360

Differential Revision: D16074509

Pulled By: zou3519

fbshipit-source-id: dc316287e26192118f3c99b945454bc50535b2ae
2019-07-02 11:46:13 -07:00
Richard Zou
e01a5bf28b Add USE_NAMEDTENSOR compilation flag. (#20162)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20162
ghimport-source-id: 0efcd67f04aa087e1dd5faeee550daa2f13ef1a5

Reviewed By: gchanan

Differential Revision: D15278211

Pulled By: zou3519

fbshipit-source-id: 6fee981915d83e820fe8b50a8f59da22a428a9bf
2019-05-09 09:09:16 -07:00
Jerry Zhang
0c32e1b43e use C10_MOBILE/ANDROID/IOS (#15363)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15363

C10_MOBILE was not defined in the NUMA file-move diff (D13380559). This moves CAFFE2_MOBILE/ANDROID/IOS to c10:

```
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_MOBILE" "C10_MOBILE"
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_ANDROID" "C10_ANDROID"
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_IOS" "C10_IOS"

```

i-am-not-moving-c2-to-c10

Reviewed By: marcinkwiatkowski

Differential Revision: D13490020

fbshipit-source-id: c4f01cacbefc0f16d5de94155c26c92fd5d780e4
2019-01-09 15:08:20 -08:00
Jerry Zhang
63e77ab6c4 Move numa.{h, cc} to c10/util (#15024)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15024

Pull Request resolved: https://github.com/pytorch/pytorch/pull/14393

As titled.

Reviewed By: dzhulgakov

Differential Revision: D13380559

fbshipit-source-id: abc3fc7321cf37323f756dfd614c7b41978734e4
2018-12-12 12:21:10 -08:00
Yudong Guang
265b55d028 Revert D13205604: Move numa.{h, cc} to c10/util
Differential Revision:
D13205604

Original commit changeset: 54166492d318

fbshipit-source-id: 89b6833518c0b554668c88ae38d97fbc47e2de17
2018-12-07 10:01:25 -08:00
Jerry Zhang
1d111853ae Move numa.{h, cc} to c10/util (#14393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14393

As titled.

Reviewed By: ezyang

Differential Revision: D13205604

fbshipit-source-id: 54166492d31827b0343ed070cc36a825dd86e2ed
2018-12-06 11:30:13 -08:00
Jongsoo Park
b5181ba1df add avx512 option (but no avx512 kernel yet) (#14664)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14664

This diff just adds a framework to add avx512 kernels.
Please be really careful about using AVX512 kernels unless you're convinced they will bring good enough *overall* speedups; AVX512 can backfire because of CPU frequency throttling.
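
As an aside for readers on current PyTorch (not part of this diff): the SIMD level ATen actually dispatches to can be inspected, and forced lower via the `ATEN_CPU_CAPABILITY` environment variable, which helps A/B-test the frequency concern above:
```
import torch

# e.g. "AVX512" or "AVX2"; run with ATEN_CPU_CAPABILITY=avx2 to force a lower
# dispatch level for comparison (API available in recent PyTorch releases).
print(torch.backends.cpu.get_cpu_capability())
```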

Reviewed By: duc0

Differential Revision: D13281944

fbshipit-source-id: 04fce8619c63f814944b727a99fbd7d35538eac6
2018-12-03 12:18:19 -08:00
Gu, Jinghui
dbab9b73b6 seperate mkl, mklml, and mkldnn (#12170)
Summary:
1. Remove avx2 support in mkldnn
2. Separate mkl, mklml, and mkldnn
3. Fix convfusion test case
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12170

Reviewed By: yinghai

Differential Revision: D10207126

Pulled By: orionr

fbshipit-source-id: 1e62eb47943f426a89d57e2d2606439f2b04fd51
2018-10-29 10:52:55 -07:00
vishwakftw
39bd73ae51 Guard NumPy usage using USE_NUMPY (#11798)
Summary:
All usages of the `ndarray` construct have now been guarded with `USE_NUMPY`. This eliminates the NumPy requirement when building PyTorch from source.
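
A sketch of the kind of call site this guards, with a NumPy-free fallback for builds that skip it:
```
import torch

try:
    import numpy as np
    t = torch.from_numpy(np.ones((2, 2)))  # needs a NumPy-enabled build
except ImportError:
    t = torch.ones(2, 2)                   # works without NumPy
```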

Fixes #11757

Reviewed By: Yangqing

Differential Revision: D10031862

Pulled By: SsnL

fbshipit-source-id: 32d84fd770a7714d544e2ca1895a3d7c75b3d712
2018-10-04 12:11:02 -07:00
Yangqing Jia
1962646d0f Remove CAFFE2_UNIQUE_LONG_TYPEMETA (#12311)
Summary:
CAFFE2_UNIQUE_LONG_TYPEMETA has been a tricky variable defined only from CMake; this is an experiment to remove it and see which compilers actually need it set.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12311

Reviewed By: dzhulgakov

Differential Revision: D10187777

Pulled By: Yangqing

fbshipit-source-id: 03e4ede4eafc291e947e0449382bc557cb624b34
2018-10-04 10:12:13 -07:00
Yangqing Jia
38f3d1fc40 move flags to c10 (#12144)
Summary:
Still in flux.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12144

Reviewed By: smessmer

Differential Revision: D10140176

Pulled By: Yangqing

fbshipit-source-id: 1a313abed022039333e3925d19f8b3ef2d95306c
2018-10-04 02:09:56 -07:00
Edward Yang
fcb3ccf23f Don't record Git version automatically via cmake (#12046)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12046

This /sounds/ like a good idea in theory, but a feature
like this must be implemented very carefully, because if
you just plop the Git version in a header (that is included
by every file in your project, as macros.h is), then every
time you do a 'git pull', you will do a FULL rebuild, because
macros.h is going to regenerate to a new version and of course
you have to rebuild a source file if a header file changes.

I don't have time to implement it correctly, so I'm axing
the feature instead. If you want git versions in, e.g.,
nightly builds, please explicitly specify that when you feed
in the version.

Reviewed By: pjh5

Differential Revision: D10030556

fbshipit-source-id: 499d001c7b8ccd4ef15ce10dd6591c300c7df27d
2018-09-25 09:40:19 -07:00
Yangqing Jia
20c516ac18
[cmake] Make cudnn optional (#8265)
* Make cudnn optional

* Remove cudnn file from cpu file
2018-06-08 02:04:27 -07:00
Orion Reblitz-Richardson
24b41da795
[build] Make ATen buildable without all Caffe2 by root cmake (#7295)
* Make ATen buildable without all Caffe2 by root cmake

* Fix typo in aten cmake

* Set BUILD_ATEN from USE_ATEN as compat

* Only set BUILD_ATEN from USE_ATEN when on

* Have USE_GLOO only set when BUILD_CAFFE2
2018-05-08 10:24:04 -07:00
Jinghui
26ddefbda1 [feature request] [Caffe2] Enable MKLDNN support for inference (#6699)
* Add operators based-on IDEEP interfaces

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Enable IDEEP as a caffe2 device

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Add test cases for IDEEP ops

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Add IDEEP as a caffe2 submodule

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Skip test cases if no IDEEP support

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Correct cmake options for IDEEP

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Add dependencies on ideep libraries

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Fix issues in IDEEP conv ops, etc.

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Move ideep from caffe2/ideep to caffe2/contrib/ideep

Signed-off-by: Gu Jinghui <jinghui.gu@intel.com>

* Update IDEEP to fix cmake issue

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Fix cmake issue caused by USE_MKL option

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Correct comments in MKL cmake file

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
2018-04-22 21:58:14 -07:00
Yinghai Lu
434f710f3f
[Caffe2] Add support to TensorRT (#6150)
* Add support to TensorRT

* Removed License header

* Bind input/output by position

* Comments

* More comments

* Add benchmark

* Add warning for performance degradation on large batch

* Address comments

* comments
2018-04-11 17:03:54 -07:00
Dmytro Dzhulgakov
08bb6ae8bb Fix OSS build
Fix the OSS build broken after D6946982 by adding a CMake detection variable
(https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-gcc4.9-ubuntu14.04-build/1343/console)
2018-03-06 00:33:11 -08:00
Kutta Srinivasan
72f259c84b Add C++ preprocessor define CAFFE2_USE_EXCEPTION_PTR to guard use of std::exception_ptr 2018-03-05 16:02:40 -08:00
Yangqing Jia
545c0937fb Making a module option for Caffe2
Summary:
This will help with releasing models that use Caffe2 but have their own operator implementations and extensions. More detailed docs will arrive later. Let's see what contbuild says.
Closes https://github.com/caffe2/caffe2/pull/1378

Differential Revision: D6155045

Pulled By: Yangqing

fbshipit-source-id: 657a4c8de2f8e095bad5ed5db5b3e476b2a877e1
2017-10-26 12:33:58 -07:00
Yangqing Jia
f0ca857e6b Explicitly use Eigen MPL2 in builds.
Summary:
Closes #1371
Closes https://github.com/caffe2/caffe2/pull/1372

Reviewed By: asaadaldien, houseroad, akyrola, bwasti

Differential Revision: D6125792

Pulled By: Yangqing

fbshipit-source-id: 5fd7ee9a5d77381fe9afbe899ef18465ecd1ceea
2017-10-23 15:06:38 -07:00
Dmytro Dzhulgakov
5527dd3b08 Expose CMake options in the binary
Summary:
Useful for figuring out which version people built with. We can just ask for the --caffe2_version gflag or get core.build_options from Python.
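
A sketch of the Python side mentioned above; the names are taken from this commit message, and Caffe2's Python API has long since been removed, so treat this as historical:
```
# Hypothetical usage per this commit message: dump the CMake options
# baked into the Caffe2 binary.
from caffe2.python import core
print(core.build_options)
```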

Also adds CMAKE_INSTALL_RPATH_USE_LINK_PATH - without it, the build wasn't working on my Mac. How should it be tested?
Closes https://github.com/caffe2/caffe2/pull/1271

Reviewed By: bddppq

Differential Revision: D5940750

Pulled By: dzhulgakov

fbshipit-source-id: 45b4c94f67e79346a10a65b34f40fd258295dad1
2017-10-04 02:33:02 -07:00
Yangqing Jia
f14d75c7ef Proper versioning and misc CMake improvements
Summary:
This brings proper versioning to Caffe2: instead of manual version macros, it puts the version information in CMake (replacing the TODO bwasti line) and uses macros.h.in to generate the version in the C++ header.

A few misc updates:
- Removed the macOS rpath; verified on a local MacBook that it is no longer needed.
- Misc updates for Caffe2 readiness:
  - Aligned cmake/Cuda.cmake with gloo's settings.
  - Upstreamed third_party/nccl so it builds with CUDA 9.
- Separated the Caffe2 CPU dependencies from the CUDA dependencies:
  - libCaffe2_CPU.so now does not depend on any CUDA libs.
  - Caffe2 Python extensions now depend on CPU and GPU separately too.
- Reduced the number of unused functions in Utils.cmake
Closes https://github.com/caffe2/caffe2/pull/1256

Reviewed By: dzhulgakov

Differential Revision: D5899210

Pulled By: Yangqing

fbshipit-source-id: 36366e47366c3258374d646cf410b5f49f95767b
2017-09-26 08:52:21 -07:00
Luke Yeager
c1356216a2 cmake: generate macros.h with configure_file()
Summary:
Using file(WRITE) caused the file to be rewritten for every CMake
reconfigure, which was causing unnecessary full rebuilds of the project
even when no source files changed.

The new strategy has the added benefit of enforcing that the macros.h file
is always generated correctly. When the main project relies on this
header for macro definitions (instead of relying on add_definitions()),
we can be more confident that the project will build correctly when used
as a library (which is the whole point of the macros.h file).

Upsides:
* No more unnecessary rebuilds
* Higher confidence that the project will compile properly as a third-party library

Downsides:
* Developers need to add an entry to `macros.h.in` whenever they would have added a new definition with `add_definitions()`
Closes https://github.com/caffe2/caffe2/pull/1103

Differential Revision: D5680367

Pulled By: Yangqing

fbshipit-source-id: 4db29c28589efda1b6a3f5f88752e3984260a0f2
2017-08-22 14:22:36 -07:00