pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
cyy	1433bc1455	Remove CAFFE2_USE_EXCEPTION_PTR (#147247 ) The check is for older compilers and is now aways true. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147247 Approved by: https://github.com/janeyx99	2025-03-06 02:56:23 +00:00
Nikita Shulga	8f5ce865a4	[Build] Add `COMMIT_SHA` to `caffe2::GetBuildOptions` (#141313 ) Using the same `tools/generate_torch_version.py` script It's already available on Python level, but not on C++ one Please note, that updating commit hash will force recompilation of less than 10 files according to ``` % touch caffe2/core/macros.h; ninja -d explain -j1 -v -n torch_python ninja explain: output caffe2/torch/CMakeFiles/gen_torch_version doesn't exist ninja explain: caffe2/torch/CMakeFiles/gen_torch_version is dirty ninja explain: /Users/malfet/git/pytorch/pytorch/torch/version.py is dirty ninja explain: output third_party/kineto/libkineto/CMakeFiles/libkineto_defs.bzl of phony edge with no inputs doesn't exist ninja explain: third_party/kineto/libkineto/CMakeFiles/libkineto_defs.bzl is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546390618881 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/core/common.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546233600752 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/core/common.cc.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/serialize/inline_container.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546651089243 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/serialize/inline_container.cc.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/serialize/file_adapter.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546224176845 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/serialize/file_adapter.cc.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/utils/threadpool/ThreadPool.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546464535054 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/utils/threadpool/ThreadPool.cc.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/impl.cpp.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301550062608920 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/impl.cpp.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/mps/MPSFallback.mm.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301547538843492 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/mps/MPSFallback.mm.o is dirty ``` Differential Revision: [D66468257](https://our.internmc.facebook.com/intern/diff/D66468257) Pull Request resolved: https://github.com/pytorch/pytorch/pull/141313 Approved by: https://github.com/ezyang	2024-11-26 00:09:36 +00:00
cyyever	c638a40a93	[Caffe2] Remove unused AVX512 code (#133160 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/133160 Approved by: https://github.com/albanD	2024-08-23 23:16:16 +00:00
PyTorch MergeBot	fa1d7b0262	Revert "Remove unused Caffe2 macros (#132979 )" This reverts commit `da65cfbdea`. Reverted https://github.com/pytorch/pytorch/pull/132979 on behalf of https://github.com/ezyang due to these are apparently load bearing internally ([comment](https://github.com/pytorch/pytorch/pull/132979#issuecomment-2284666332))	2024-08-12 18:34:56 +00:00
cyy	da65cfbdea	Remove unused Caffe2 macros (#132979 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/132979 Approved by: https://github.com/ezyang	2024-08-09 04:48:20 +00:00
cyy	574ae9afb8	[Submodule] Remove third-party onnx-tensorrt (#126542 ) It seems that tensorrt is not used by the C++ code, may be due to the removal of Caffe2. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126542 Approved by: https://github.com/ezyang	2024-05-19 22:34:24 +00:00
Nikita Shulga	7ca6e0d38f	[EZ] Add `CUSPARSELT` to build variables (#116213 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116213 Approved by: https://github.com/Skylion007, https://github.com/kit1980, https://github.com/atalman ghstack dependencies: #116212	2023-12-21 01:02:11 +00:00
Nikita Shulga	74119a3482	[EZ] Fix typo in `USE_GLOO` var (#116212 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116212 Approved by: https://github.com/Skylion007, https://github.com/kit1980	2023-12-21 01:02:11 +00:00
hongxyan	66a76516bf	[ROCm] Disabling Kernel Asserts for ROCm by default - fix and clean up and refactoring (#114660 ) Related to #103973 #110532 #108404 #94891 Context: As commented in `6ae0554d11/cmake/Dependencies.cmake (L1198)` Kernel asserts are enabled by default for CUDA and disabled for ROCm. However it is somewhat broken, and Kernel assert was still enabled for ROCm. Disabling kernel assert is also needed for users who do not have PCIe atomics support. These community users have verified that disabling the kernel assert in PyTorch/ROCm platform fixed their pytorch workflow, like torch.sum script, stable-diffusion. (see the related issues) Changes: This pull request serves the following purposes: * Refactor and clean up the logic, make it simpler for ROCm to enable and disable Kernel Asserts * Fix the bug that Kernel Asserts for ROCm was not disabled by default. Specifically, - Renamed `TORCH_DISABLE_GPU_ASSERTS` to `C10_USE_ROCM_KERNEL_ASSERT` for the following reasons: (1) This variable only applies to ROCm. (2) The new name is more align with #define CUDA_KERNEL_ASSERT function. (3) With USE_ in front of the name, we can easily control it with environment variable to turn on and off this feature during build (e.g. `USE_ROCM_KERNEL_ASSERT=1 python setup.py develop` will enable kernel assert for ROCm build). - Get rid of the `ROCM_FORCE_ENABLE_GPU_ASSERTS' to simplify the logic and make it easier to understand and maintain - Added `#cmakedefine` to carry over the CMake variable to C++ Tests: (1) build with default mode and verify that USE_ROCM_KERNEL_ASSERT is OFF(0), and kernel assert is disabled: ``` python setup.py develop ``` Verify CMakeCache.txt has correct value. ``` /xxxx/pytorch/build$ grep USE_ROCM_KERNEL_ASSERT CMakeCache.txt USE_ROCM_KERNEL_ASSERT:BOOL=0 ``` Tested the following code in ROCm build and CUDA build, and expected the return code differently. ``` subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"]) ``` This piece of code is adapted from below unit test to get around the limitation that this unit test now was skipped for ROCm. (We will check to enable this unit test in the future) ``` python test/test_cuda_expandable_segments.py -k test_fixed_cuda_assert_async ``` Ran the following script, expecting r ==0 since the CUDA_KERNEL_ASSERT is defined as nothing: ``` >> import sys >>> import subprocess >>> r=subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"]) >>> r 0 ``` (2) Enable the kernel assert by building with USE_ROCM_KERNEL_ASSERT=1, or USE_ROCM_KERNEL_ASSERT=ON ``` USE_ROCM_KERNEL_ASSERT=1 python setup.py develop ``` Verify `USE_ROCM_KERNEL_ASSERT` is `1` ``` /xxxx/pytorch/build$ grep USE_ROCM_KERNEL_ASSERT CMakeCache.txt USE_ROCM_KERNEL_ASSERT:BOOL=1 ``` Run the assert test, and expected return code not equal to 0. ``` >> import sys >>> import subprocess >>> r=subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"]) >>>/xxxx/pytorch/aten/src/ATen/native/hip/TensorCompare.hip:108: _assert_async_cuda_kernel: Device-side assertion `input[0] != 0' failed. :0:rocdevice.cpp :2690: 2435301199202 us: [pid:206019 tid:0x7f6cf0a77700] Callback: Queue 0x7f64e8400000 aborting with error : HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception. code: 0x1016 >>> r -6 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/114660 Approved by: https://github.com/jeffdaily, https://github.com/malfet, https://github.com/jithunnair-amd	2023-12-13 15:44:53 +00:00
mikey dagitses	5f5d675587	remove unused CAFFE2_VERSION macros (#97337 ) remove unused CAFFE2_VERSION macros Summary: Nothing reads these and they are completely subsumed by TORCH_VERSION. Getting rid of these will be helpful for build unification, since they are also not used internally. Test Plan: Rely on CI. Reviewers: sahanp Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/97337 Approved by: https://github.com/malfet	2023-03-24 16:02:35 +00:00
Pruthvi Madugundu	fbd08fb358	Introduce TORCH_DISABLE_GPU_ASSERTS (#84190 ) - Asserts for CUDA are enabled by default - Disabled for ROCm by default by setting `TORCH_DISABLE_GPU_ASSERTS` to `ON` - Can be enabled for ROCm by setting above variable to`OFF` during build or can be forcefully enabled by setting `ROCM_FORCE_ENABLE_GPU_ASSERTS:BOOL=ON` This is follow up changes as per comment in PR #81790, comment [link](https://github.com/pytorch/pytorch/pull/81790#issuecomment-1215929021) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84190 Approved by: https://github.com/jeffdaily, https://github.com/malfet	2022-11-04 04:43:05 +00:00
PyTorch MergeBot	0fa23663cc	Revert "Introduce TORCH_DISABLE_GPU_ASSERTS (#84190 )" This reverts commit `1e2c4a6e0e`. Reverted https://github.com/pytorch/pytorch/pull/84190 on behalf of https://github.com/malfet due to Needs internal changes, has to be landed via co-dev	2022-11-02 18:13:37 +00:00
Pruthvi Madugundu	1e2c4a6e0e	Introduce TORCH_DISABLE_GPU_ASSERTS (#84190 ) - Asserts for CUDA are enabled by default - Disabled for ROCm by default by setting `TORCH_DISABLE_GPU_ASSERTS` to `ON` - Can be enabled for ROCm by setting above variable to`OFF` during build or can be forcefully enabled by setting `ROCM_FORCE_ENABLE_GPU_ASSERTS:BOOL=ON` This is follow up changes as per comment in PR #81790, comment [link](https://github.com/pytorch/pytorch/pull/81790#issuecomment-1215929021) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84190 Approved by: https://github.com/jeffdaily, https://github.com/malfet	2022-11-02 17:41:57 +00:00
zhang, xiaobing	86b86202b5	fix torch.config can't respect USE_MKLDNN flag issue (#75001 ) Fixes https://github.com/pytorch/pytorch/issues/74949, which reports that torch.config can't respect USE_MKLDNN flag. Pull Request resolved: https://github.com/pytorch/pytorch/pull/75001 Approved by: https://github.com/malfet	2022-07-17 15:00:48 +00:00
Jing Xu	3c7044728b	Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289 ) More detailed description of benefits can be found at #41001. This is Intel's counterpart of NVidia’s NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx). ITT is a functionality for labeling trace data during application execution across different Intel tools. For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both standalone VTune Profiler [(https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html](https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html)) and Kineto-integrated VTune functionality in the future. It works for both Intel CPU and Intel XPU devices. Pitch Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer customized code scopes on CPU, like NVTX for NVidia GPU. This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch. Usage example: ``` with torch.autograd.profiler.emit_itt(): for i in range(10): torch.itt.range_push('step_{}'.format(i)) model(input) torch.itt.range_pop() ``` cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289 Approved by: https://github.com/malfet	2022-07-13 13:50:15 +00:00
PyTorch MergeBot	1454515253	Revert "Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289 )" This reverts commit `f988aa2b3f`. Reverted https://github.com/pytorch/pytorch/pull/63289 on behalf of https://github.com/malfet due to broke trunk, see `f988aa2b3f`	2022-06-30 12:49:41 +00:00
Jing Xu	f988aa2b3f	Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289 ) More detailed description of benefits can be found at #41001. This is Intel's counterpart of NVidia’s NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx). ITT is a functionality for labeling trace data during application execution across different Intel tools. For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both standalone VTune Profiler [(https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html](https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html)) and Kineto-integrated VTune functionality in the future. It works for both Intel CPU and Intel XPU devices. Pitch Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer customized code scopes on CPU, like NVTX for NVidia GPU. This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch. Usage example: ``` with torch.autograd.profiler.emit_itt(): for i in range(10): torch.itt.range_push('step_{}'.format(i)) model(input) torch.itt.range_pop() ``` cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289 Approved by: https://github.com/malfet	2022-06-30 05:14:03 +00:00
Pruthvi Madugundu	085e2f7bdd	[ROCm] Changes not to rely on CUDA_VERSION or HIP_VERSION (#65610 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610 - Replace HIP_PLATFORM_HCC with USE_ROCM - Dont rely on CUDA_VERSION or HIP_VERSION and use USE_ROCM and ROCM_VERSION. - In the next PR - Will be removing the mapping from CUDA_VERSION to HIP_VERSION and CUDA to HIP in hipify. - HIP_PLATFORM_HCC is deprecated, so will add HIP_PLATFORM_AMD to support HIP host code compilation on gcc. cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd Reviewed By: jbschlosser Differential Revision: D30909053 Pulled By: ezyang fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06	2021-09-29 09:55:43 -07:00
Ailing Zhang	621198978a	Move USE_NUMPY to more appropriate targets (#51143 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51143 Test Plan: CI Reviewed By: wconstab Differential Revision: D26084123 fbshipit-source-id: af4abe4ef87c1ebe5434938320526a925f5c34c8	2021-01-27 15:44:12 -08:00
Rong Rong	54022e4f9b	add new build settings to torch.__config__ (#48380 ) Summary: many newly added build settings are not saved in torch.__config__. adding them to the mix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48380 Reviewed By: samestep Differential Revision: D25161951 Pulled By: walterddr fbshipit-source-id: 1d3dee033c93f2d1a7e2a6bcaf88aedafeac8d31	2020-12-01 14:16:36 -08:00
Jiakai Liu	3a0e35c9f2	[pytorch] deprecate static dispatch (#43564 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43564 Static dispatch was originally introduced for mobile selective build. Since we have added selective build support for dynamic dispatch and tested it in FB production for months, we can deprecate static dispatch to reduce the complexity of the codebase. Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D23324452 Pulled By: ljk53 fbshipit-source-id: d2970257616a8c6337f90249076fca1ae93090c7	2020-08-27 14:52:48 -07:00
Hong Xu	daf00beaba	Remove duplicated Numa detection code. (#30628 ) Summary: cmake/Dependencies.cmake (`1111a6b810/cmake/Dependencies.cmake (L595-L609)`) has already detected Numa. Duplicated detection and variables may lead to incorrect results. Close https://github.com/pytorch/pytorch/issues/29968 Pull Request resolved: https://github.com/pytorch/pytorch/pull/30628 Differential Revision: D18782479 Pulled By: ezyang fbshipit-source-id: f74441f03367f11af8fa59b92d656c6fa070fbd0	2020-01-03 08:48:46 -08:00
Richard Zou	9047d4df45	Remove all remaining usages of BUILD_NAMEDTENSOR (#31116 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31116 Changelist: - remove BUILD_NAMEDTENSOR macro - remove torch._C._BUILD_NAMEDTENSOR - remove all python behavior that relies on torch._C._BUILD_NAMEDTENSOR Future: - In the next diff, I will remove all usages of ATen/core/EnableNamedTensor.h since that header doesn't do anything anymore - After that, we'll be done with the BUILD_NAMEDTENSOR removal. Test Plan: - run CI Differential Revision: D18934951 Pulled By: zou3519 fbshipit-source-id: 0a0df0f1f0470d0a01c495579333a2835aac9f5d	2019-12-12 09:53:03 -08:00
Lucian Grijincu	9c9f14029d	Revert D16929363: Revert D16048264: Add static dispatch mode to reduce mobile code size Differential Revision: D16929363 Original commit changeset: 69d302929e18 fbshipit-source-id: add36a6047e4574788eb127c40f6166edeca705f	2019-08-20 17:08:31 -07:00
Lucian Grijincu	bd6cf5099b	Revert D16048264: Add static dispatch mode to reduce mobile code size Differential Revision: D16048264 Original commit changeset: ad1e50951273 fbshipit-source-id: 69d302929e183e2da26b64dcc24c69c3b7de186b	2019-08-20 16:26:18 -07:00
Roy Li	6824c9018d	Add static dispatch mode to reduce mobile code size Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22335 Test Plan: Imported from OSS Differential Revision: D16048264 Pulled By: li-roy fbshipit-source-id: ad1e50951273962a51bac7c25c3d2e5a588a730e	2019-08-20 12:21:32 -07:00
Hong Xu	693871ded3	Rename macros and build options NAMEDTENSOR_ENABLED to BUILD_NAMEDTENSOR (#22360 ) Summary: Currently the build system accepts USE_NAMEDTENSOR from the environment variable and turns it into NAMEDTENSOR_ENABLED when passing to CMake. This discrepancy does not seem necessary and complicates the build system. The naming of this build option is also semantically incorrect ("BUILD_" vis-a-vis "USE_"). This commit eradicate this issue before it is made into a stable release. The support of NO_NAMEDTENSOR is also removed, since PyTorch has been quite inconsistent about "NO_*" build options. --- Note: All environment variables with their names starting with `BUILD_` are currently automatically passed to CMake with no need of an additional wrapper. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22360 Differential Revision: D16074509 Pulled By: zou3519 fbshipit-source-id: dc316287e26192118f3c99b945454bc50535b2ae	2019-07-02 11:46:13 -07:00
Richard Zou	e01a5bf28b	Add USE_NAMEDTENSOR compilation flag. (#20162 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20162 ghimport-source-id: 0efcd67f04aa087e1dd5faeee550daa2f13ef1a5 Reviewed By: gchanan Differential Revision: D15278211 Pulled By: zou3519 fbshipit-source-id: 6fee981915d83e820fe8b50a8f59da22a428a9bf	2019-05-09 09:09:16 -07:00
Jerry Zhang	0c32e1b43e	use C10_MOBILE/ANDROID/IOS (#15363 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15363 Didn't define C10_MOBILE in the numa file move diff: D13380559 move CAFFE2_MOBILE/ANDROID/IOS to c10 ``` codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_MOBILE" "C10_MOBILE" codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_ANDROID" "C10_ANDROID" codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_IOS" "C10_IOS" ``` i-am-not-moving-c2-to-c10 Reviewed By: marcinkwiatkowski Differential Revision: D13490020 fbshipit-source-id: c4f01cacbefc0f16d5de94155c26c92fd5d780e4	2019-01-09 15:08:20 -08:00
Jerry Zhang	63e77ab6c4	Move numa.{h, cc} to c10/util (#15024 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15024 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14393 att Reviewed By: dzhulgakov Differential Revision: D13380559 fbshipit-source-id: abc3fc7321cf37323f756dfd614c7b41978734e4	2018-12-12 12:21:10 -08:00
Yudong Guang	265b55d028	Revert D13205604: Move numa.{h, cc} to c10/util Differential Revision: D13205604 Original commit changeset: 54166492d318 fbshipit-source-id: 89b6833518c0b554668c88ae38d97fbc47e2de17	2018-12-07 10:01:25 -08:00
Jerry Zhang	1d111853ae	Move numa.{h, cc} to c10/util (#14393 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14393 att Reviewed By: ezyang Differential Revision: D13205604 fbshipit-source-id: 54166492d31827b0343ed070cc36a825dd86e2ed	2018-12-06 11:30:13 -08:00
Jongsoo Park	b5181ba1df	add avx512 option (but no avx512 kernel yet) (#14664 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14664 This diff just adds a framework to add avx512 kernels. Please be really really careful about using avx512 kernels unless you're convinced using avx512 will bring good enough overall speedups because it can backfire because of cpu frequency going down. Reviewed By: duc0 Differential Revision: D13281944 fbshipit-source-id: 04fce8619c63f814944b727a99fbd7d35538eac6	2018-12-03 12:18:19 -08:00
Gu, Jinghui	dbab9b73b6	seperate mkl, mklml, and mkldnn (#12170 ) Summary: 1. Remove avx2 support in mkldnn 2. Seperate mkl, mklml, and mkldnn 3. Fix convfusion test case Pull Request resolved: https://github.com/pytorch/pytorch/pull/12170 Reviewed By: yinghai Differential Revision: D10207126 Pulled By: orionr fbshipit-source-id: 1e62eb47943f426a89d57e2d2606439f2b04fd51	2018-10-29 10:52:55 -07:00
vishwakftw	39bd73ae51	Guard NumPy usage using USE_NUMPY (#11798 ) Summary: All usages of the `ndarray` construct have now been guarded with `USE_NUMPY`. This eliminates the requirement of NumPy while building PyTorch from source. Fixes #11757 Reviewed By: Yangqing Differential Revision: D10031862 Pulled By: SsnL fbshipit-source-id: 32d84fd770a7714d544e2ca1895a3d7c75b3d712	2018-10-04 12:11:02 -07:00
Yangqing Jia	1962646d0f	Remove CAFFE2_UNIQUE_LONG_TYPEMETA (#12311 ) Summary: CAFFE2_UNIQUE_LONG_TYPEMETA has been a tricky variable defined only from cmake - this is an experiment to remove it and see what exact compilers need that one set. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12311 Reviewed By: dzhulgakov Differential Revision: D10187777 Pulled By: Yangqing fbshipit-source-id: 03e4ede4eafc291e947e0449382bc557cb624b34	2018-10-04 10:12:13 -07:00
Yangqing Jia	38f3d1fc40	move flags to c10 (#12144 ) Summary: still influx. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12144 Reviewed By: smessmer Differential Revision: D10140176 Pulled By: Yangqing fbshipit-source-id: 1a313abed022039333e3925d19f8b3ef2d95306c	2018-10-04 02:09:56 -07:00
Edward Yang	fcb3ccf23f	Don't record Git version automatically via cmake (#12046 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12046 This /sounds/ like a good idea in theory, but a feature like this must be implemented very carefully, because if you just plop the Git version in a header (that is included by every file in your project, as macros.h is), then every time you do a 'git pull', you will do a FULL rebuild, because macros.h is going to regenerate to a new version and of course you have to rebuild a source file if a header file changes. I don't have time to implement it correctly, so I'm axing the feature instead. If you want git versions in, e.g., nightly builds, please explicitly specify that when you feed in the version. Reviewed By: pjh5 Differential Revision: D10030556 fbshipit-source-id: 499d001c7b8ccd4ef15ce10dd6591c300c7df27d	2018-09-25 09:40:19 -07:00
Yangqing Jia	20c516ac18	[cmake] Make cudnn optional (#8265 ) * Make cudnn optional * Remove cudnn file from cpu file	2018-06-08 02:04:27 -07:00
Orion Reblitz-Richardson	24b41da795	[build] Make ATen buildable without all Caffe2 by root cmake (#7295 ) * Make ATen buildable without all Caffe2 by root cmake * Fix typo in aten cmake * Set BUILD_ATEN from USE_ATEN as compat * Only set BUILD_ATEN from USE_ATEN when on * Have USE_GLOO only set when BUILD_CAFFE2	2018-05-08 10:24:04 -07:00
Jinghui	26ddefbda1	[feature request] [Caffe2] Enable MKLDNN support for inference (#6699 ) * Add operators based-on IDEEP interfaces Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Enable IDEEP as a caffe2 device Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add test cases for IDEEP ops Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add IDEEP as a caffe2 submodule Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Skip test cases if no IDEEP support Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Correct cmake options for IDEEP Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add dependences on ideep libraries Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix issues in IDEEP conv ops and etc. Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Move ideep from caffe2/ideep to caffe2/contrib/ideep Signed-off-by: Gu Jinghui <jinghui.gu@intel.com> * Update IDEEP to fix cmake issue Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix cmake issue caused by USE_MKL option Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Correct comments in MKL cmake file Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>	2018-04-22 21:58:14 -07:00
Yinghai Lu	434f710f3f	[Caffe2] Add support to TensorRT (#6150 ) * Add support to TensorRT * Removed License header * Bind input/output by position * Comments * More comments * Add benchmark * Add warning for performance degradation on large batch * Address comments * comments	2018-04-11 17:03:54 -07:00
Dmytro Dzhulgakov	08bb6ae8bb	Fix OSS build Fix OSS build broken after D6946982 by adding CMake detection variable (https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-gcc4.9-ubuntu14.04-build/1343/console)	2018-03-06 00:33:11 -08:00
Kutta Srinivasan	72f259c84b	Add C++ preprocessor define CAFFE2_USE_EXCEPTION_PTR to guard use of std::exception_ptr	2018-03-05 16:02:40 -08:00
Yangqing Jia	545c0937fb	Making a module option for Caffe2 Summary: This will help releasing models that are using Caffe2 but have their own operator implementations and extensions. More detailed docs to arrive later. Let's see what contbuild says. Closes https://github.com/caffe2/caffe2/pull/1378 Differential Revision: D6155045 Pulled By: Yangqing fbshipit-source-id: 657a4c8de2f8e095bad5ed5db5b3e476b2a877e1	2017-10-26 12:33:58 -07:00
Yangqing Jia	f0ca857e6b	Explicitly use Eigen MPL2 in builds. Summary: Closes #1371 Closes https://github.com/caffe2/caffe2/pull/1372 Reviewed By: asaadaldien, houseroad, akyrola, bwasti Differential Revision: D6125792 Pulled By: Yangqing fbshipit-source-id: 5fd7ee9a5d77381fe9afbe899ef18465ecd1ceea	2017-10-23 15:06:38 -07:00
Dmytro Dzhulgakov	5527dd3b08	Expose CMake options in the binary Summary: Useful for figuring out with people which version they built with. We can just ask for --caffe2_version gflag or get core.build_options from python. Also adds CMAKE_INSTALL_RPATH_USE_LINK_PATH - without it wasn't building on my Mac. How should it be tested? Closes https://github.com/caffe2/caffe2/pull/1271 Reviewed By: bddppq Differential Revision: D5940750 Pulled By: dzhulgakov fbshipit-source-id: 45b4c94f67e79346a10a65b34f40fd258295dad1	2017-10-04 02:33:02 -07:00
Yangqing Jia	f14d75c7ef	Proper versioning and misc CMake improvements Summary: This brings proper versioning in Caffe2: instead of manual version macros, this puts the version information in CMake (replacing the TODO bwasti line) and uses macros.h.in to then generate the version in the C++ header. A few misc updates: - Removed the mac os rpath, verified on local macbook that it is no longer needed. - Misc updates for caffe2 ready: - Mapped cmake/Cuda.cmake with gloo's setting. - upstreamed third_party/nccl so it builds with cuda 9. - Separated the Caffe2 cpu dependencies and cuda dependencies - now libCaffe2_CPU.so do not depend on any cuda libs. - caffe2 python extensions now depend on cpu and gpu separately too. - Reduced the number of unused functions in Utils.cmake Closes https://github.com/caffe2/caffe2/pull/1256 Reviewed By: dzhulgakov Differential Revision: D5899210 Pulled By: Yangqing fbshipit-source-id: 36366e47366c3258374d646cf410b5f49f95767b	2017-09-26 08:52:21 -07:00
Luke Yeager	c1356216a2	cmake: generate macros.h with configure_file() Summary: Using file(WRITE) caused the file to be rewritten for every CMake reconfigure, which was causing unnecessary full rebuilds of the project even when no source files changed. The new strategy has the added benefit of enforcing that the macros.h file is always generated correctly. When the main project relies on this header for macro definitions (instead of relying on add_definitions()), we can be more confident that the project will build correctly when used as a library (which is the whole point of the macros.h file). Upsides: * No more unnecessary rebuilds * Higher confidence that the project will compile properly as a third-party library Downsides: * Developers need to add an entry to `macros.h.in` whenever they would have added a new definition with `add_definitions()` Closes https://github.com/caffe2/caffe2/pull/1103 Differential Revision: D5680367 Pulled By: Yangqing fbshipit-source-id: 4db29c28589efda1b6a3f5f88752e3984260a0f2	2017-08-22 14:22:36 -07:00

49 Commits