Commit Graph

66 Commits

Author SHA1 Message Date
Samuel Marks
e6779d4357 [*.py] Rename "Arguments:" to "Args:" (#49736)
Summary:
I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings. The same inconsistency shows up in the PyTorch codebase:

```sh
(pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do
    printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done
Args:      1095
Arguments: 0336
```

It is easy enough to extend my parsers to support both variants; however, it looks like `Arguments:` is wrong anyway, as per:

  - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md)

  - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md)

  - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst)

Therefore, only `Args:` is valid. This PR replaces `Arguments:` with `Args:` throughout the codebase.
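
For illustration, here is a minimal, hypothetical docstring in the Google style (not taken from the codebase); the parameter section is titled `Args:`:

```python
def scale(values, factor=1.0):
    """Multiply each value by a scalar.

    Args:
        values: Iterable of numbers to scale.
        factor: Scalar multiplier. Defaults to 1.0.

    Returns:
        A list with each input value multiplied by ``factor``.
    """
    return [v * factor for v in values]
```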

PS: For related PRs, see tensorflow/tensorflow/pull/45420

PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736

Reviewed By: albanD

Differential Revision: D25710534

Pulled By: soumith

fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619
2020-12-28 09:34:47 -08:00
Nikita Shulga
2dff0b3e91 Fix typos in comments (#48316)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48316

Reviewed By: walterddr, mrshenli

Differential Revision: D25125123

Pulled By: malfet

fbshipit-source-id: 6f31e5456cc078cc61b288191f1933711acebba0
2020-11-24 10:56:40 -08:00
Abaho Katabarwa
de3a48013a Use CAFFE2_USE_MSVC_STATIC_RUNTIME to determine when to avoid waiting for global destructors on Windows (#43532)
Summary:
We are trying to build libtorch statically (BUILD_SHARED_LIBS=OFF) and then link it into a DLL. Our setup hits the infinite loop mentioned [here](54c05fa34e/torch/csrc/autograd/engine.cpp (L228)) because, even though we build with `BUILD_SHARED_LIBS=OFF`, everything still ends up linked into a DLL at the end of the day.

This PR fixes the issue by changing the condition to guard on which windows runtime the build links against using the `CAFFE2_USE_MSVC_STATIC_RUNTIME` flag. `CAFFE2_USE_MSVC_STATIC_RUNTIME` defaults to ON when `BUILD_SHARED_LIBS=OFF`, so backwards compatibility is maintained.

I'm not entirely confident I understand the subtleties of the Windows runtime versus the linking setup, but this setup works for us and should not affect the existing builds.

Fixes https://github.com/pytorch/pytorch/issues/44470

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43532

Reviewed By: mrshenli

Differential Revision: D24053767

Pulled By: albanD

fbshipit-source-id: 1127fefe5104d302a4fc083106d4e9f48e50add8
2020-10-01 16:41:14 -07:00
Bugra Akyildiz
27c7158166 Remove __future__ imports for legacy Python2 supports (#45033)
Summary:
There is a tool called `2to3` whose `future` fixer specifically removes these imports; the `caffe2` directory has the most of these redundant imports:

```2to3 -f future -w caffe2```
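
For illustration, the `future` fixer strips Python 2 compatibility lines like the following (from a hypothetical file), all of which are no-ops on Python 3:

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
```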

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033

Reviewed By: seemethere

Differential Revision: D23808648

Pulled By: bugra

fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
2020-09-23 17:57:02 -07:00
guol-fnst
e1afa9daff fix cmake bug (#39930)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39930

Differential Revision: D22391207

Pulled By: ezyang

fbshipit-source-id: bde19a112846e124d4e5316ba947f48d4dccf361
2020-07-06 08:02:30 -07:00
peter
bfa5070cbc Fix rebuild with Ninja on Windows (#37917)
Summary:
It is currently broken due to a ninja bug.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37917

Differential Revision: D21470357

Pulled By: ezyang

fbshipit-source-id: c0ed858c63a7504bf2c4961dd7ed906fc3f4502a
2020-05-07 19:15:27 -07:00
Wojciech Baranowski
945672bf3e cmake: improve dependencies in incremental builds (#37661)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/26304

Test procedure:
With ninja:
[x] Build a clean checkout
[x] Build again. Result: Only 10 libraries are (needlessly) linked again, the extra delay on a 24-core machine is <10s.
[x] Build for the third time. Result: Virtually instantaneous, with no extra rebuilding.
[x] Modify DispatchTable.h. Build again. Result: `.cu` files are rebuilt, as well as many `.cpp` files
[x] Build for the fifth time. Result: Virtually instantaneous, with no extra rebuilding.
[x] Touch one of the `.depend` files. Build again. Result: Only 10 libraries are (needlessly) linked again, the extra delay on a 24-core machine is <10s.

Without ninja:
[x] Build a clean checkout
[x] Build again. Result: There is some unnecessary rebuilding. But it was also happening before this change.
[x] Build for the third time. Result: Virtually instantaneous, with no extra rebuilding.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37661

Differential Revision: D21434624

Pulled By: ezyang

fbshipit-source-id: 379d2315486b8bb5972c184f9b8da8e00d38c338
2020-05-06 14:25:18 -07:00
Nikita Shulga
e2adcc1c53 Report CUDA separate compilation flag (#35726)
Summary:
In the build summary, report whether CUDA code is compiled with separate compilation enabled.

Also, correctly handle space-separated TORCH_NVCC_FLAGS when adding them to NVCC_CUDA_FLAGS.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35726

Test Plan: CI + local build with TORCH_NVCC_FLAGS set to "-Xfatbin -compress-all"

Differential Revision: D20830885

Pulled By: malfet

fbshipit-source-id: 0e0ecab4a97b6c8662a2c4bfc817857da9f32201
2020-04-02 19:35:02 -07:00
pinzhenx
bd604cb5b7 Upgrade MKL-DNN to DNNL v1.2 (#32422)
Summary:
## Motivation

This PR upgrades MKL-DNN from v0.20 to DNNL v1.2 and resolves https://github.com/pytorch/pytorch/issues/30300.

DNNL (Deep Neural Network Library) is the new brand of MKL-DNN, which improves performance, quality, and usability over the old version.

This PR focuses on the migration of all existing functionalities, including minor fixes, performance improvements and code clean-up. It serves as the cornerstone of our future efforts to accommodate new features like OpenCL support, BF16 training, INT8 inference, etc., and to let the PyTorch community derive more benefits from the Intel Architecture.

<br>

## What's included?

Even though DNNL has many breaking changes to its API, we managed to absorb most of them in ideep. This PR contains minimal changes to the integration code in PyTorch. Below is a summary of the changes:

<br>

**General:**

1. Replace op-level allocator with global-registered allocator

```
// before
ideep::sum::compute<AllocForMKLDNN>(scales, {x, y}, z);

// after
ideep::sum::compute(scales, {x, y}, z);
```

The allocator is now registered at `aten/src/ATen/native/mkldnn/IDeepRegistration.cpp`. Thereafter all tensors derived from the `cpu_engine` (by default) will use the c10 allocator.

```
RegisterEngineAllocator cpu_alloc(
  ideep::engine::cpu_engine(),
  [](size_t size) {
    return c10::GetAllocator(c10::DeviceType::CPU)->raw_allocate(size);
  },
  [](void* p) {
    c10::GetAllocator(c10::DeviceType::CPU)->raw_deallocate(p);
  }
);
```
------

2. Simplify group convolution

We had a scenario in convolution where the ideep tensor shape did not match the aten tensor shape: when `groups > 1`, DNNL expects weight tensors to be 5-d with an extra group dimension, e.g. `goihw` instead of `oihw` in the 2d conv case.

As shown below, a lot of extra checks used to come with this difference in shape. Now we've completely hidden this difference inside ideep, and all tensors align with PyTorch's definition, so we can safely remove these checks from both the aten and c2 integration code.

```
// aten/src/ATen/native/mkldnn/Conv.cpp

if (w.ndims() == x.ndims() + 1) {
  AT_ASSERTM(
      groups > 1,
      "Only group _mkldnn_conv2d weights could have been reordered to 5d");
  kernel_size[0] = w.get_dim(0) * w.get_dim(1);
  std::copy_n(
      w.get_dims().cbegin() + 2, x.ndims() - 1, kernel_size.begin() + 1);
} else {
  std::copy_n(w.get_dims().cbegin(), x.ndims(), kernel_size.begin());
}
```

------

3. Enable DNNL built-in cache

Previously, we stored DNNL jitted kernels along with intermediate buffers inside ideep using an LRU cache. Now we are switching to the newly added DNNL built-in cache, and **no longer** caching buffers in order to reduce memory footprint.

This change will mainly be reflected in lower memory usage in memory profiling results. On the code side, we removed a couple of lines of `op_key_` that depended on the ideep cache before.

------

4. Use 64-bit integer to denote dimensions

We changed the type of `ideep::dims` from `vector<int32_t>` to `vector<int64_t>`. This renders ideep dims no longer compatible with the 32-bit dims used by caffe2, so we use something like `{stride_.begin(), stride_.end()}` to cast the parameter `stride_` into an int64 vector.

<br>

**Misc changes in each commit:**

**Commit:** change build options

Some build options were slightly changed, mainly to avoid name collisions with other projects that include DNNL as a subproject. In addition, the DNNL built-in cache is enabled by the option `DNNL_ENABLE_PRIMITIVE_CACHE`.

Old | New
-- | --
WITH_EXAMPLE | MKLDNN_BUILD_EXAMPLES
WITH_TEST | MKLDNN_BUILD_TESTS
MKLDNN_THREADING | MKLDNN_CPU_RUNTIME
MKLDNN_USE_MKL | N/A (MKL is no longer used)

------

**Commit:** aten reintegration

- aten/src/ATen/native/mkldnn/BinaryOps.cpp

    Implement binary ops using new operation `binary` provided by DNNL

- aten/src/ATen/native/mkldnn/Conv.cpp

    Clean up group convolution checks
    Simplify conv backward integration

- aten/src/ATen/native/mkldnn/MKLDNNConversions.cpp

    Simplify prepacking convolution weights

- test/test_mkldnn.py

    Fixed an issue in the conv2d unit test: it did not actually compare conv results between the mkldnn and aten implementations before. Instead, it compared mkldnn with mkldnn, because the default CPU path also goes through mkldnn. We now use `torch.backends.mkldnn.flags` to fix this (see the sketch after this list)

- torch/utils/mkldnn.py

    Prepack the weight tensor in the module's `__init__` to significantly improve performance
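
Regarding the `test/test_mkldnn.py` fix above, here is a minimal sketch of the testing pattern (not the actual test code), assuming the `torch.backends.mkldnn.flags` context manager:

```python
import torch

x = torch.randn(1, 3, 8, 8)
conv = torch.nn.Conv2d(3, 4, kernel_size=3)

y_default = conv(x)                # may be dispatched to mkldnn internally
with torch.backends.mkldnn.flags(enabled=False):
    y_reference = conv(x)          # forced onto the plain aten CPU path
assert torch.allclose(y_default, y_reference, atol=1e-5)
```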

------

**Commit:** caffe2 reintegration

- caffe2/ideep/ideep_utils.h

    Clean up unused type definitions

- caffe2/ideep/operators/adam_op.cc & caffe2/ideep/operators/momentum_sgd_op.cc

    Unify tensor initialization with `ideep::tensor::init`; retire `ideep::tensor::reinit`

- caffe2/ideep/operators/conv_op.cc & caffe2/ideep/operators/quantization/int8_conv_op.cc

    Clean up group convolution checks
    Revamp convolution API

- caffe2/ideep/operators/conv_transpose_op.cc

    Clean up group convolution checks
    Clean up deconv workaround code

------

**Commit:** custom allocator

- Register c10 allocator as mentioned above

<br><br>

## Performance

We tested inference on some common models based on user scenarios, and most performance numbers are either better than or on par with DNNL 0.20.

ratio: new / old | Latency (batch=1 4T) | Throughput (batch=64 56T)
-- | -- | --
pytorch resnet18 | 121.4% | 99.7%
pytorch resnet50 | 123.1% | 106.9%
pytorch resnext101_32x8d | 116.3% | 100.1%
pytorch resnext50_32x4d | 141.9% | 104.4%
pytorch mobilenet_v2 | 163.0% | 105.8%
caffe2 alexnet | 303.0% | 99.2%
caffe2 googlenet-v3 | 101.1% | 99.2%
caffe2 inception-v1 | 102.2% | 101.7%
caffe2 mobilenet-v1 | 356.1% | 253.7%
caffe2 resnet101 | 100.4% | 99.8%
caffe2 resnet152 | 99.8% | 99.8%
caffe2 shufflenet | 141.1% | 69.0% †
caffe2 squeezenet | 98.5% | 99.2%
caffe2 vgg16 | 136.8% | 100.6%
caffe2 googlenet-v3 int8 | 100.0% | 100.7%
caffe2 mobilenet-v1 int8 | 779.2% | 943.0%
caffe2 resnet50 int8 | 99.5% | 95.5%

_Configuration:
Platform: Skylake 8180
Latency Test: 4 threads, warmup 30, iteration 500, batch size 1
Throughput Test: 56 threads, warmup 30, iteration 200, batch size 64_

† Shufflenet is one of the few models that require temp buffers during inference. The performance degradation is an expected issue, since we no longer cache any buffers in ideep. As a workaround, we suggest users opt for a caching allocator like **jemalloc** as a drop-in replacement for the system allocator in such heavy workloads.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32422

Test Plan:
Perf results: https://our.intern.facebook.com/intern/fblearner/details/177790608?tab=Experiment%20Results

10% improvement for ResNext with avx512, neutral on avx2

More results: https://fb.quip.com/ob10AL0bCDXW#NNNACAUoHJP

Reviewed By: yinghai

Differential Revision: D20381325

Pulled By: dzhulgakov

fbshipit-source-id: 803b906fd89ed8b723c5fcab55039efe3e4bcb77
2020-03-26 22:07:59 -07:00
cyy
5be8a4e027 find mkl installed by nuget (#34031)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34031

Differential Revision: D20221807

Pulled By: ezyang

fbshipit-source-id: 827e2775956f408febb287676bbf9a96a70fe2d4
2020-03-03 07:44:20 -08:00
Hong Xu
f255b7a3ac Drop support of the build option USE_GLOO_IBVERBS (#33163)
Summary:
Two releases have passed since its deprecation:
8a026d4f74
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33163

Differential Revision: D19850713

Pulled By: ezyang

fbshipit-source-id: 30a60df470b88e8c40e33112296e437cde29c49f
2020-02-11 20:35:50 -08:00
peterjc123
ebed008dd4 Correct /MP usage in MSVC (#33120)
Summary:
## Several flags
`/MP[M]`: It is a flag for the compiler `cl`. It leads to object-level multiprocessing. By default, it spawns M processes where M is the number of cores on the PC.
`/maxcpucount:[M]`: It is a flag for the generator `msbuild`. It leads to project-level multiprocessing. By default, it spawns M processes where M is the number of cores on the PC.
`/p:CL_MPCount=[M]`: It is a flag for the generator `msbuild`. It leads the generator to pass `/MP[M]` to the compiler.
`/j[M]`: It is a flag for the generator `ninja`. It leads to object-level multiprocessing. By default, it spawns M processes where M is the number of cores on the PC.

## Reason for the change
1. Object-level multiprocessing is preferred over project-level multiprocessing.
2. ~For ninja, we don't need to set `/MP`, otherwise M * M processes will be spawned.~ Actually, this is not correct, because in ninja configs there is only one source file per command. Therefore, the `/MP` switch should have no effect.
3. For msbuild, if it is called through the Python configuration scripts, then `/p:CL_MPCount=[M]` will be added; otherwise, we add `/MP` to `CMAKE_CXX_FLAGS` (see the sketch after this list).
4. ~It may be a possible fix for https://github.com/pytorch/pytorch/issues/28271, https://github.com/pytorch/pytorch/issues/27463 and https://github.com/pytorch/pytorch/issues/25393, because `/MP` is also passed to `nvcc`.~ This is probably not true, because `/MP` should have no effect given there is only one source file per command.
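
As a rough sketch of point 3 (a hypothetical invocation, not the actual build script), the generator-level and compiler-level parallelism switches can be forwarded to msbuild like this:

```python
import multiprocessing
import subprocess

max_jobs = multiprocessing.cpu_count()
subprocess.check_call([
    "cmake", "--build", ".", "--config", "Release", "--",
    f"/maxcpucount:{max_jobs}",    # msbuild: project-level parallelism
    f"/p:CL_MPCount={max_jobs}",   # msbuild: have cl compile with /MP[M]
])
```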

## Reference
1. https://docs.microsoft.com/en-us/cpp/build/reference/mp-build-with-multiple-processes?view=vs-2019
2. https://github.com/Microsoft/checkedc-clang/wiki/Parallel-builds-of-clang-on-Windows
3. https://blog.kitware.com/cmake-building-with-all-your-cores/
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33120

Differential Revision: D19817227

Pulled By: ezyang

fbshipit-source-id: f8d01f835016971729c7a8d8a0d1cb8a8c2c6a5f
2020-02-10 11:29:25 -08:00
cyy
27e1fecabd let user specify CUDA_HOST_COMPILER
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32904

Differential Revision: D19729047

Pulled By: ezyang

fbshipit-source-id: c233e3924f71a025c51d25a7e3a8d728dac8730a
2020-02-04 14:32:12 -08:00
peterjc123
9a5fd2eb07 Fix conflicts in CMAKE_GENERATOR and generator (#30971)
Summary:
That is, fix conflicts between the `CMAKE_GENERATOR` environment variable and the generator specified in `-G`.

https://cmake.org/cmake/help/latest/variable/CMAKE_GENERATOR.html
According to the documentation, the generator can be determined in two ways:
1. Specified via `-G`
2. Read from the `CMAKE_GENERATOR` environment variable

We should avoid conflicts between these two methods. This fixes https://github.com/pytorch/pytorch/issues/30910.
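
A hypothetical sketch of the reconciliation logic (names and structure are illustrative, not the actual code in the build scripts):

```python
import os

def resolve_generator(cmake_args):
    """Hypothetical sketch: reconcile a -G argument with the CMAKE_GENERATOR env var."""
    explicit = None
    for i, arg in enumerate(cmake_args):
        if arg == "-G" and i + 1 < len(cmake_args):
            explicit = cmake_args[i + 1]       # "-G <generator>"
        elif arg.startswith("-G") and len(arg) > 2:
            explicit = arg[2:]                 # "-G<generator>"
    env_gen = os.environ.get("CMAKE_GENERATOR")
    if explicit and env_gen and explicit != env_gen:
        raise RuntimeError(
            f"Conflicting generators: -G {explicit!r} vs CMAKE_GENERATOR={env_gen!r}")
    return explicit or env_gen  # None lets CMake pick its platform default

# Example: resolve_generator(["-G", "Ninja"]) -> "Ninja"
```
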
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30971

Differential Revision: D18927529

Pulled By: mingbowan

fbshipit-source-id: e9a179ceb32d6fbabfaeac6cfe9e6170ca170b20
2019-12-10 22:22:26 -08:00
Hong Xu
21d7532dfe Add more comment on NumPy detection in Python scripts.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30417

Differential Revision: D18716502

Pulled By: albanD

fbshipit-source-id: 0b1b86f882e0e24cb6845e4a44708048e7e3b4a8
2019-11-26 17:38:27 -08:00
Hong Xu
3455231e9c Expose configuration of Numa directories to setup.py (#30104)
Summary:
https://github.com/pytorch/pytorch/issues/29968
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30104

Differential Revision: D18656882

Pulled By: ezyang

fbshipit-source-id: f932a98674033f1a3184dc1c22faa6f8c2b50134
2019-11-22 07:07:39 -08:00
David Reiss
d22f61432d Update fbjni and enable PyTorch JNI build
Summary:
- Add a "BUILD_JNI" option that enables building PyTorch JNI bindings and
  fbjni.  This is off by default because it adds a dependency on jni.h.
- Update to the latest fbjni so we can inhibit building its tests,
  because they depend on gtest.
- Set JAVA_HOME and BUILD_JNI in Linux binary build configurations if we
  can find jni.h in Docker.

Test Plan:
- Built on dev server.
- Verified that libpytorch_jni links after libtorch when both are built
  in a parallel build.

Differential Revision: D18536828

fbshipit-source-id: 19cb3be8298d3619352d02bb9446ab802c27ec66
2019-11-15 13:59:44 -08:00
Hong Xu
ff9d508b88 Remove tools/setup_helpers/cuda.py. (#28617)
Summary:
Except for the Windows default path, everything it does has already been done in
FindCUDA.cmake. Searching for nvcc in PATH has been added to FindCUDA.cmake (https://github.com/pytorch/pytorch/issues/29160). The Windows default path part is moved to
build_pytorch_libs.py. CUDA_HOME is kept for now because other parts of
the build system are still using it.
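
A hedged sketch of the kind of Windows default-path fallback that stays in the Python build script (the function name and paths are illustrative, not the actual code):

```python
import glob
import os
import platform

def default_cuda_home():
    """Illustrative fallback: honor CUDA_HOME, else guess a platform default."""
    if os.environ.get("CUDA_HOME"):
        return os.environ["CUDA_HOME"]
    if platform.system() == "Windows":
        candidates = glob.glob(
            "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
        if candidates:
            # Prefer the highest toolkit version, e.g. v11.1 over v9.2.
            return max(candidates, key=lambda p: tuple(
                int(x) for x in p.rsplit("v", 1)[-1].split(".")))
    return "/usr/local/cuda"
```
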
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28617

Differential Revision: D18347814

Pulled By: ezyang

fbshipit-source-id: 22bb7eccc17b559ce3efc1ca964e3fbb270b5b0f
2019-11-06 07:12:01 -08:00
Hong Xu
5e5cbceeba remove tools/setup_helpers/cudnn.py (#25876)
Summary:
FindCUDNN.cmake and cuda.cmake have done the detection. This commit deletes `tools/setup_helpers/cudnn.py` as it is no longer needed.

Previously in https://github.com/pytorch/pytorch/issues/25482, one test failed because TensorRT detects cuDNN differently, and there may be situations where we can find cuDNN but TensorRT cannot. This is fixed by passing our detection result down to TensorRT.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25876

Differential Revision: D17346270

Pulled By: ezyang

fbshipit-source-id: c1e7ad4a1cb20f964fe07a72906f2f002425d894
2019-09-24 07:44:33 -07:00
Hong Xu
a96e41b7c0 Use expected_wrapper only if CMAKE_{C,CXX}_COMPILER and/or is not set by user (#26306)
Summary:
This will honor the user's preference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26306

Differential Revision: D17408030

Pulled By: soumith

fbshipit-source-id: 6841b805603d40cd7caf78dbb42405a0c931f052
2019-09-16 16:12:29 -07:00
Hong Xu
8a026d4f74 Remove tools/setup_helpers/dist_check.py (#25879)
Summary:
What dist_check.py does is largely just determining whether we should
set "USE_IBVERBS" to ON or OFF when the user sets "USE_GLOO_IBVERBS"
to ON. But this is unnecessary, because this complicated determination
will always be overridden by gloo:

2101e02cea/cmake/Dependencies.cmake (L19-L28)

Since dist_check.py becomes irrelevant, this commit also simplifies the
setting of `USE_DISTRIBUTED` (by removing its explicit setting in Python scripts), and deprecates `USE_GLOO_IBVERBS` in favor
of `USE_IBVERBS`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25879

Differential Revision: D17282395

Pulled By: pietern

fbshipit-source-id: a10735f50728d89c3d81fd57bcd26764e7f84dd1
2019-09-10 04:33:28 -07:00
Edward Yang
97b432bdf0 Back out "[pytorch][PR] remove tools/setup_helpers/cudnn.py"
Summary:
Original commit changeset: abd9cd0244ca

(Note: this ignores all push blocking failures!)

Test Plan: none

Reviewed By: nairbv

Differential Revision: D17259003

fbshipit-source-id: d7e067eeb36192766c639bfcbc66f540ce8eb77e
2019-09-09 06:47:45 -07:00
Hong Xu
66ac6698f6 remove tools/setup_helpers/cudnn.py (#25482)
Summary:
FindCUDNN.cmake and cuda.cmake have done the detection. This commit deletes `tools/setup_helpers/cudnn.py` as it is no longer needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25482

Differential Revision: D17226408

Pulled By: ezyang

fbshipit-source-id: abd9cd0244cabea1f5d9f93f828d632d77c8dd5e
2019-09-06 06:54:35 -07:00
Hong Xu
cc4211069e Do not pass down USE_GLOO_IBVERBS to CMake (#25720)
Summary:
It doesn't seem to be used anywhere once it gets down to CMake, either in this repo or in any submodule.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25720

Differential Revision: D17225088

Pulled By: pietern

fbshipit-source-id: a24b080e6346a203b345e2b834fe095e3b9aece0
2019-09-06 02:40:42 -07:00
Hong Xu
0b1fee0819 Remove escape_path in our build system. (#24044)
Summary:
`escape_path` was added in https://github.com/pytorch/pytorch/issues/16412.

Also make some CUDNN_* CMake variables build options, so as to avoid reading environment variables directly via `$ENV` in CMake scripts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24044

Differential Revision: D16783426

Pulled By: ezyang

fbshipit-source-id: cb196b0013418d172d0d36558995a437bd4a3986
2019-08-13 20:38:19 -07:00
Hong Xu
994f643d9a Do not force USE_SYSTEM_EIGEN_INSTALL to be OFF in Python build scripts (#23990)
Summary:
Not sure whether 34c0043aae still makes sense.

`USE_SYSTEM_EIGEN_INSTALL` is OFF by default (as set in CMakeLists.txt). If a user wants to change this build option, I don't see any reason to force them to do it in `CMakeCache.txt`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23990

Differential Revision: D16732569

Pulled By: ezyang

fbshipit-source-id: 4604b4a1d5857552ad02e76aee91641aea48801a
2019-08-09 08:33:48 -07:00
Hong Xu
e80b48390d When matching a line in CMakeCache.txt, ensure A=B and "A"=B are matched (#23745)
Summary:
Currently when reading CMakeCache.txt, only `VAR:TYPE=VAL` can be matched.
This works well for CMake-generated lines, but a user may add a line
without specifying a type (`VAR=VAL`), which is totally legitimate in the
eyes of CMake. This improvement in the regex ensures that `VAR=VAL` is
also matched. The situation of `"VAR":TYPE=VAL` is also corrected.
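
A sketch of the kind of pattern involved (illustrative only, not the exact regex in `tools/setup_helpers/cmake.py`):

```python
import re

# Matches CMakeCache.txt entries in any of the forms:
#   VAR:TYPE=VAL      (CMake-generated)
#   VAR=VAL           (hand-added, no type)
#   "VAR":TYPE=VAL    (quoted variable name)
CACHE_LINE = re.compile(
    r'^\s*"?(?P<var>[\w.+/-]+)"?\s*(?::\s*(?P<type>\w+))?\s*=\s*(?P<val>.*)$')

for line in ['USE_CUDA:BOOL=ON', 'MY_FLAG=1', '"CMAKE_BUILD_TYPE":STRING=Release']:
    m = CACHE_LINE.match(line)
    print(m.group('var'), m.group('type'), m.group('val'))
```
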
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23745

Differential Revision: D16726514

Pulled By: ezyang

fbshipit-source-id: 6c50150d58926563837cf77d156c24d644666ef0
2019-08-08 18:07:28 -07:00
Hong Xu
1a9334ea59 Hotpatch CXXFLAGS to be the same as CFLAGS if CXXFLAGS is not set. (#23568)
Summary:
This fixes a build regression caused by https://github.com/pytorch/pytorch/issues/23528, because we used to let CXXFLAGS equal CFLAGS.
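
A minimal sketch of the hotpatch (not the exact code): if only CFLAGS is set, mirror it into CXXFLAGS so C++ compilation keeps the old behaviour.

```python
import os

# If the user set CFLAGS but not CXXFLAGS, reuse CFLAGS for C++ as well.
if "CXXFLAGS" not in os.environ and "CFLAGS" in os.environ:
    os.environ["CXXFLAGS"] = os.environ["CFLAGS"]
```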

cc suo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23568

Differential Revision: D16568820

Pulled By: suo

fbshipit-source-id: 64a0dc923c08ac1751224f42bc4ccdc707341762
2019-08-07 16:25:57 -07:00
Hong Xu
323aad6b20 No need to handle the dependency of INSTALL_TEST on BUILD_TEST in cmake.py (#23806)
Summary:
Simplifying https://github.com/pytorch/pytorch/issues/23793: The dependency relationship between
{INSTALL,BUILD}_TEST is already properly handled in CMakeLists.txt. All
we need to do is to pass down INSTALL_TEST.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23806

Differential Revision: D16691833

Pulled By: soumith

fbshipit-source-id: 7607492b2d82db3f79b174373a92e2810a854a61
2019-08-07 11:34:31 -07:00
Soumith Chintala
7d9e69e62e allow INSTALL_TEST to pass through from env to cmake (#23793)
Summary:
This allows `INSTALL_*` to pass through to cmake.
An additional fix is that if `INSTALL_TEST` is specified, it won't use `BUILD_TEST` as the default value for `INSTALL_TEST`.
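
A hypothetical sketch of the pass-through behaviour (variable names and defaults are illustrative): `INSTALL_TEST` falls back to `BUILD_TEST` only when the user did not set it explicitly.

```python
import os

build_test = os.environ.get("BUILD_TEST", "ON")
install_test = os.environ.get("INSTALL_TEST", build_test)  # explicit value wins
cmake_args = [f"-DBUILD_TEST={build_test}", f"-DINSTALL_TEST={install_test}"]
print(cmake_args)
```
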
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23793

Differential Revision: D16648668

Pulled By: soumith

fbshipit-source-id: 52c2a0d8033bc556355b87a6731a577940de9859
2019-08-05 09:55:14 -07:00
Edward Yang
3cc7da3a7d Revert D16561561: [pytorch][PR] Remove preprocessing of CFLAGS, CPPFLAGS, and LDFLAGS in Python scripts.
Differential Revision:
D16561561

Original commit changeset: 962a27a2b0a1

fbshipit-source-id: 82ed08e5599ddbb9ed96352ac4572aa73df65aac
2019-07-30 13:28:19 -07:00
Hong Xu
cfe9400996 Remove preprocessing of CFLAGS, CPPFLAGS, and LDFLAGS in Python scripts. (#23528)
Summary:
After https://github.com/pytorch/pytorch/issues/23455, there is no need for this preprocessing in Python scripts.
These flags will be processed automatically by CMake (and CPPFLAGS here
probably was meant to be CXXFLAGS).

Reference:

- https://cmake.org/cmake/help/v3.15/envvar/CFLAGS.html
- https://cmake.org/cmake/help/v3.15/envvar/CXXFLAGS.html
- https://cmake.org/cmake/help/v3.15/envvar/LDFLAGS.html
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23528

Differential Revision: D16561561

Pulled By: ezyang

fbshipit-source-id: 962a27a2b0a18db0f95477ad067a2611e4128187
2019-07-30 08:07:36 -07:00
Hong Xu
8ada7c9920 Remove two CMAKE_ build options from additional_options. (#23451)
Summary:
Following up 915261c8be
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23451

Differential Revision: D16542303

Pulled By: ezyang

fbshipit-source-id: 1406c311c198eb237f85d6d8f1f0d58626be8257
2019-07-29 08:13:59 -07:00
Hong Xu
b335f3910f Remove redundant MSVC_Z7_OVERRIDE processing and combine "/EHa" flag setup (#23455)
Summary:
- MSVC_Z7_OVERRIDE has already been handled in CMakeLists.txt. No need to process it once more in the Python scripts.
- Option MSVC_Z7_OVERRIDE should be visible to the user only if MSVC is used.
- Move the setting of the "/EHa" flag to CMakeLists.txt, where other MSVC-specific flags are processed. This also further prepares for the removal of the redundant cflags setup in the Python build scripts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23455

Differential Revision: D16542274

Pulled By: ezyang

fbshipit-source-id: 4d3b8b07161478bbba8a21feb6ea24c9024e21ac
2019-07-29 08:08:47 -07:00
Ilia Cherniavskii
74f8094ea5 Rename threading build options
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23407

Test Plan:
USE_CUDA=0 ATEN_THREADING=TBB USE_OPENMP=0 USE_TBB=1 MKL_THREADING=TBB
BLAS=MKL USE_MKLDNN=1 MKLDNN_THREADING=TBB BUILD_BINARY=1 python
setup.py develop install --cmake

./build/bin/parallel_info

Imported from OSS

Differential Revision: D16522538

Pulled By: ilia-cher

fbshipit-source-id: 75c4761d93a7f5936f28e4c5eedcd27d8490d0c5
2019-07-26 13:09:14 -07:00
Hong Xu
0b4c0b95e9 For second-time build, let build_type be inferred from CMakeCache.txt.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23323

Test Plan: Imported from OSS

Differential Revision: D16517621

Pulled By: ezyang

fbshipit-source-id: 22984df214d01246a7868980e148936698940ea8
2019-07-26 08:50:28 -07:00
Jesse Hellemn
39fd264799 Fix lint
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23381

Differential Revision: D16496327

Pulled By: pjh5

fbshipit-source-id: 529029544a5f8c8106bcb7cebdc71aee33e3b86c
2019-07-25 10:39:37 -07:00
Hong Xu
82545ecc71 Specify build dir as a global variable in BUILD_DIR in the build system.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23318

Test Plan: Imported from OSS

Differential Revision: D16493987

Pulled By: ezyang

fbshipit-source-id: 497e9dd924280f61dde095b4f2b50f5402d9da97
2019-07-25 07:19:47 -07:00
Hong Xu
915261c8be Let users pass CMake-specific options starting with CMAKE_ to CMake. (#22776)
Summary:
This should make it more convenient to follow https://github.com/pytorch/pytorch/issues/8433's suggestion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22776

Differential Revision: D16493553

Pulled By: ezyang

fbshipit-source-id: 852f4779e70f84a4c9f7bab4c2ae4927248ffc93
2019-07-25 07:19:44 -07:00
Hong Xu
f91b19c2aa Do not explicitly set USE_FBGEMM in tools/setup_helpers/cmake.py (#23314)
Summary:
Instead, defer its default value to CMakeLists.txt

NO_FBGEMM has already been handled in tools/setup_helpers/env.py
(although deprecated)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23314

Differential Revision: D16493580

Pulled By: ezyang

fbshipit-source-id: 7255eb1df5e8a6dd0362507d68da0986a9ed46e2
2019-07-25 07:11:52 -07:00
Hong Xu
fd1d06e317 Let Python build scripts accept both CMAKE_BUILD_TYPE and the oldschool DEBUG and REL_WITH_DEB_INFO variables. (#22875)
Summary:
Currently the build type is decided by the environment variables DEBUG
and REL_WITH_DEB_INFO. This commit also lets CMAKE_BUILD_TYPE be
effective, which makes the interface more consistent with CMake. This
also prepares for https://github.com/pytorch/pytorch/issues/22776.
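
A hypothetical sketch of the resulting precedence (function name and defaults are illustrative): `CMAKE_BUILD_TYPE` wins, then the old-school `DEBUG` / `REL_WITH_DEB_INFO` variables.

```python
import os

def build_type():
    """Illustrative precedence for choosing the CMake build type."""
    explicit = os.environ.get("CMAKE_BUILD_TYPE")
    if explicit:
        return explicit
    if os.environ.get("DEBUG", "0") not in ("0", ""):
        return "Debug"
    if os.environ.get("REL_WITH_DEB_INFO", "0") not in ("0", ""):
        return "RelWithDebInfo"
    return "Release"
```
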
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22875

Differential Revision: D16281663

Pulled By: ezyang

fbshipit-source-id: 952f92aad85ff59f1c7abe8256eca8a4a0936026
2019-07-24 08:07:47 -07:00
Hong Xu
60c46dd4df Let CMake handle NCCL detection instead of our handcrafted Python script. (#22930)
Summary:
 ---

How does the current code subsume all detections in the deleted `nccl.py`?

- The dependency of `USE_NCCL` on the OS and `USE_CUDA` is handled as dependency options in `CMakeLists.txt`.

- The main NCCL detection happens in [FindNCCL.cmake](8377d4b32c/cmake/Modules/FindNCCL.cmake), which is called by [nccl.cmake](8377d4b32c/cmake/External/nccl.cmake). When `USE_SYSTEM_NCCL` is false, the previous Python code deferred the detection to `find_package(NCCL)`. The change in `nccl.cmake` retains this.

- `USE_STATIC_NCCL` in the previous Python code simply changes the name of the detected library. This is done in `IF (USE_STATIC_NCCL)`.

- Now we only need to look at how the lines below line 20 in `nccl.cmake` are subsumed. These lines list paths to header and library directories that NCCL headers and libraries may reside in and try to search these directories for the key header and library files in turn. These are done by `find_path` for headers and `find_library` for the library files in `FindNCCL.cmake`.
  * The call of [find_path](https://cmake.org/cmake/help/v3.8/command/find_path.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for headers in `<prefix>/include` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. Like the Python code, this commit sets `CMAKE_PREFIX_PATH` to search for `<prefix>` in `NCCL_ROOT_DIR` and the CUDA home directory. `CMAKE_SYSTEM_PREFIX_PATH` includes the standard directories such as `/usr/local` and `/usr`. `NCCL_INCLUDE_DIR` is also specifically handled.

  * Similarly, the call of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for libraries in directories including `<prefix>/lib` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. But it also handles the edge cases intended to be solved in the Python code more properly:
     - It only searches for `<prefix>/lib64` (and `<prefix>/lib32`) if it is appropriate on the system.
     - It only searches for `<prefix>/lib/<arch>` for the right `<arch>`, unlike the Python code, which searches for `lib/<arch>` in a generic way (e.g., the Python code searches for `/usr/lib/x86_64-linux-gnu`, but in reality systems may have `/usr/lib/x86_64-some-customized-name-linux-gnu`; see https://unix.stackexchange.com/a/226180/38242).

 ---

Regarding relevant issues:

- https://github.com/pytorch/pytorch/issues/12063 and https://github.com/pytorch/pytorch/issues/2877: These are properly handled, as explained in the updated comment.
- https://github.com/pytorch/pytorch/issues/2941 does not change NCCL detection specifically for Windows (it changed CUDA detection).
- b7e258f81e A versioned library detection is added, but the order is reversed: the unversioned library is now preferred. This is because unversioned libraries are normally linked to versioned libraries and preferred by users, and local installations by users are often unversioned. As the documentation of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) suggests:

> When using this to specify names with and without a version suffix, we recommend specifying the unversioned name first so that locally-built packages can be found before those provided by distributions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22930

Differential Revision: D16440275

Pulled By: ezyang

fbshipit-source-id: 11fe80743d4fe89b1ed6f96d5d996496e8ec01aa
2019-07-23 08:45:51 -07:00
Edward Yang
798d5d9771 Revert D16281714: Add sanity checks for NCCL detection.
Differential Revision:
D16281714

Original commit changeset: 396bcbf099bd

fbshipit-source-id: a22cc112d1b6a62d689f9d8a7f93e8be3abe2a44
2019-07-16 13:58:27 -07:00
Hong Xu
e2046f8c1d Add sanity checks for NCCL detection.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22819

Test Plan: Imported from OSS

Differential Revision: D16281714

Pulled By: ezyang

fbshipit-source-id: 396bcbf099bd07b996cf779c6b43092096b52d90
2019-07-16 11:32:32 -07:00
Edward Yang
ccb28939bf Revert D16222539: [pytorch][PR] Let users pass CMake-specific options starting with CMAKE_ to CMake.
Differential Revision:
D16222539

Original commit changeset: 1cc6e69c85cd

fbshipit-source-id: c79d68976ac1047c54b32c093429b23e9482cd8f
2019-07-12 07:57:57 -07:00
Hong Xu
612eed31a9 Let users pass CMake-specific options starting with CMAKE_ to CMake. (#22776)
Summary:
This should make it more convenient to follow https://github.com/pytorch/pytorch/issues/8433's suggestion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22776

Differential Revision: D16222539

Pulled By: ezyang

fbshipit-source-id: 1cc6e69c85cdf0d7f8074653445410d85746847c
2019-07-12 07:28:32 -07:00
Hong Xu
e1fdf8a46f Add comments about adding new build options. (#22641)
Summary:
Also revert the change to cmake.py in
c97829d701. The comments are added to
prevent similar incidents in the future (which have occurred a couple of times in the past).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22641

Differential Revision: D16171763

Pulled By: ezyang

fbshipit-source-id: 5a65f9fbb3c1c798ebd25521932bfde0ad3d16fc
2019-07-09 16:41:46 -07:00
Supriya Rao
c97829d701 Adding FC and Relu QNNPACK ops to C10 registry (#22174)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22174

This is a preliminary change outlining the approach we plan to follow to integrate QNNPACK operators into the PyTorch backend. The operators will not be made visible to the user in the Python world; ultimately we will have a function that calls the QNNPACK backend based on the environment it is run on.

The goal of the project is to integrate QNNPACK library with PyTorch to achieve good performance for quantized mobile models.

Reviewed By: ljk53

Differential Revision: D15806325

fbshipit-source-id: c14e1d864ac94570333a7b14031ea231d095c2ae
2019-07-08 14:21:42 -07:00
peter
ce8c9d9bd5 Fix cuda detection script (#22527)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/22507
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22527

Differential Revision: D16126220

Pulled By: ezyang

fbshipit-source-id: eb05141282b0f058324da1b3d3cb34566f222a67
2019-07-08 07:06:59 -07:00
Hong Xu
a6441c00d6 Remove build variable NCCL_EXTERNAL (#22467)
Summary:
It's always set to equal USE_NCCL, since we made Gloo depend on the Caffe2 NCCL
build. See 30da84fbe1
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22467

Differential Revision: D16098581

Pulled By: ezyang

fbshipit-source-id: f706ec7cebc2e6315bafca013b669f5a72e04815
2019-07-02 15:36:44 -07:00