pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Luca Wehrstedt	b213041df3	Also install c10d headers with .h extension (#73422 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73422 Fixes https://github.com/pytorch/pytorch/issues/73421 ghstack-source-id: 149978120 Test Plan: None Reviewed By: cbalioglu Differential Revision: D34475711 fbshipit-source-id: 9e4d1d57021cbff51f53762b32bbfffbf3f81c4c (cherry picked from commit 72ff35e28242132cf20e538d43ad3b63b3e497b1)	2022-02-28 08:39:10 +00:00
Nikita Shulga	dc5cda0cca	Update min python version to 3.7 in setup.py and mypy configs (#71494 ) Summary: As Python-3.6 has reached EOL Pull Request resolved: https://github.com/pytorch/pytorch/pull/71494 Reviewed By: atalman Differential Revision: D33667509 Pulled By: malfet fbshipit-source-id: ab1f03085cfb9161df77ba5ce373b81f5e7ef3ae (cherry picked from commit `60343166d9`)	2022-01-20 00:03:57 +00:00
Taylor Robie	ebc66bfeea	[Profiler] Pull helper methods into dedicated file. (And start `torch/csrc/profiler` folder.) (#69255 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69255 One thing that I've found as I optimize the profiler is that there's a lot of intermingled code, where the kineto profiler relies on the legacy (autograd) profiler for generic operations. This made optimization hard because I had to manage too many complex dependencies. (Exacerbated by the USE_KINETO #ifdef's sprinkled around.) This PR is the first of several to restructure the profiler(s) so the later optimizations go in easier. Test Plan: Unit tests Reviewed By: aaronenyeshi Differential Revision: D32671972 fbshipit-source-id: efa83b40dde4216f368f2a5fa707360031a85707	2021-12-16 10:33:47 -08:00
Peter Bell	4829dcea09	Codegen: Generate separate headers per operator (#68247 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68247 This splits `Functions.h`, `Operators.h`, `NativeFunctions.h` and `NativeMetaFunctions.h` into separate headers per operator base name. With `at::sum` as an example, we can include: ```cpp <ATen/core/sum.h> // Like Functions.h <ATen/core/sum_ops.h> // Like Operators.h <ATen/core/sum_native.h> // Like NativeFunctions.h <ATen/core/sum_meta.h> // Like NativeMetaFunctions.h ``` The umbrella headers are still being generated, but all they do is include from the `ATen/ops` folder. Further, `TensorBody.h` now only includes the operators that have method variants, which means files that only include `Tensor.h` don't need to be rebuilt when you modify function-only operators. Currently there are about 680 operators that don't have method variants, so this is potentially a significant win for incremental builds. Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D32596272 Pulled By: albanD fbshipit-source-id: 447671b2b6adc1364f66ed9717c896dae25fa272	2021-12-14 06:40:08 -08:00
Jithun Nair	8dfdc3df82	[ROCm] Refactor how to specify AMD gpu targets using PYTORCH_ROCM_ARCH (#61706 ) Summary: Remove all hardcoded AMD gfx targets PyTorch build and Magma build will use rocm_agent_enumerator as backup if PYTORCH_ROCM_ARCH env var is not defined PyTorch extensions will use same gfx targets as the PyTorch build, unless PYTORCH_ROCM_ARCH env var is defined torch.cuda.get_arch_list() now works for ROCm builds PyTorch CI dockers will continue to be built for gfx900 and gfx906 for now. PYTORCH_ROCM_ARCH env var can be a space or semicolon separated list of gfx archs eg. "gfx900 gfx906" or "gfx900;gfx906" cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH Pull Request resolved: https://github.com/pytorch/pytorch/pull/61706 Reviewed By: seemethere Differential Revision: D32735862 Pulled By: malfet fbshipit-source-id: 3170e445e738e3ce373203e1e4ae99c84e645d7d	2021-12-13 15:41:40 -08:00
Michael Suo	ad182479b0	[deploy] docs (#69251 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69251 This adds some actual documentation for deploy, which is probably useful since we told everyone it was experimentally available so they will probably be looking at what the heck it is. It also wires up various components of the OSS build to actually work when used from an external project. Differential Revision: D32783312 Test Plan: Imported from OSS Reviewed By: wconstab Pulled By: suo fbshipit-source-id: c5c0a1e3f80fa273b5a70c13ba81733cb8d2c8f8	2021-12-01 21:55:18 -08:00
Eli Uriegas	f398320e0d	packaging: Include lazy headers in package_data (#68817 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68817 Looks like these files are getting used by downstream xla so we need to include them in our package_data Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D32622241 Pulled By: seemethere fbshipit-source-id: 7b64e5d4261999ee58bc61185bada6c60c2bb5cc	2021-11-29 08:29:48 -08:00
Can Balioglu	6e640a0acf	Revise the socket implementation of c10d (#68226 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68226 Note that this PR is unusually big due to the urgency of the changes. Please reach out to me in case you wish to have a "pair" review. This PR introduces a major refactoring of the socket implementation of the C10d library. A big portion of the logic is now contained in the `Socket` class and a follow-up PR will further consolidate the remaining parts. As of today the changes in this PR offer: - significantly better error handling and much more verbose logging (see the example output below) - explicit support for IPv6 and dual-stack sockets - correct handling of signal interrupts - better Windows support A follow-up PR will consolidate `send`/`recv` logic into `Socket` and fully migrate to non-blocking sockets. ## Example Output ``` [I logging.h:21] The client socket will attempt to connect to an IPv6 address on (127.0.0.1, 29501). [I logging.h:21] The client socket is attempting to connect to [localhost]:29501. [W logging.h:28] The server socket on [localhost]:29501 is not yet listening (Error: 111 - Connection refused), retrying... [I logging.h:21] The server socket will attempt to listen on an IPv6 address. [I logging.h:21] The server socket is attempting to listen on [::]:29501. [I logging.h:21] The server socket has started to listen on [::]:29501. [I logging.h:21] The client socket will attempt to connect to an IPv6 address on (127.0.0.1, 29501). [I logging.h:21] The client socket is attempting to connect to [localhost]:29501. [I logging.h:21] The client socket has connected to [localhost]:29501 on [localhost]:42650. [I logging.h:21] The server socket on [::]:29501 has accepted a connection from [localhost]:42650. [I logging.h:21] The client socket has connected to [localhost]:29501 on [localhost]:42722. [I logging.h:21] The server socket on [::]:29501 has accepted a connection from [localhost]:42722. 
[I logging.h:21] The client socket will attempt to connect to an IPv6 address on (127.0.0.1, 29501). [I logging.h:21] The client socket is attempting to connect to [localhost]:29501. [I logging.h:21] The client socket has connected to [localhost]:29501 on [localhost]:42724. [I logging.h:21] The server socket on [::]:29501 has accepted a connection from [localhost]:42724. [I logging.h:21] The client socket will attempt to connect to an IPv6 address on (127.0.0.1, 29501). [I logging.h:21] The client socket is attempting to connect to [localhost]:29501. [I logging.h:21] The client socket has connected to [localhost]:29501 on [localhost]:42726. [I logging.h:21] The server socket on [::]:29501 has accepted a connection from [localhost]:42726. ``` ghstack-source-id: 143501987 Test Plan: Run existing unit and integration tests on devserver, Fedora, Ubuntu, macOS Big Sur, Windows 10. Reviewed By: Babar, wilson100hong, mrshenli Differential Revision: D32372333 fbshipit-source-id: 2204ffa28ed0d3683a9cb3ebe1ea8d92a831325a	2021-11-16 20:49:25 -08:00
Robert Blackwell	cee4e8f35d	Add FlexiBLAS build support per #64752 (#64815 ) Summary: To enable building torch+dependencies, set WITH_BLAS=flexi BLAS=FlexiBLAS Fixes https://github.com/pytorch/pytorch/issues/64752 Pull Request resolved: https://github.com/pytorch/pytorch/pull/64815 Reviewed By: jbschlosser Differential Revision: D31997745 Pulled By: albanD fbshipit-source-id: db208d59002f5896608a03132616400f09d972aa	2021-10-28 11:28:00 -07:00
Nikita Shulga	77beccaedb	Do not build PyTorch with caffe2 by default (#66658 ) Summary: CAFFE2 has been deprecated for a while, but is still included in every PyTorch build. We should stop building it by default, although CI should still validate that caffe2 code is buildable. Build even fewer dependencies when compiling mobile builds without Caffe2 Introduce `TEST_CAFFE2` in torch.common.utils Skip `TestQuantizedEmbeddingOps` and `TestJit.test_old_models_bc` if code is compiled without Caffe2 Should be landed after https://github.com/pytorch/builder/pull/864 Pull Request resolved: https://github.com/pytorch/pytorch/pull/66658 Reviewed By: driazati, seemethere, janeyx99 Differential Revision: D31669156 Pulled By: malfet fbshipit-source-id: 1cc45e2d402daf913a4685eb9f841cc3863e458d	2021-10-21 20:32:47 -07:00
Can Balioglu	65e6194aeb	Introduce the torchrun entrypoint (#64049 ) Summary: This PR introduces a new `torchrun` entrypoint that simply "points" to `python -m torch.distributed.run`. It is shorter and less error-prone to type and gives a nicer syntax than a rather cryptic `python -m ...` command line. Along with the new entrypoint the documentation is also updated and places where `torch.distributed.run` are mentioned are replaced with `torchrun`. cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski SciPioneer H-Huang mrzzd cbalioglu gcramer23 Pull Request resolved: https://github.com/pytorch/pytorch/pull/64049 Reviewed By: cbalioglu Differential Revision: D30584041 Pulled By: kiukchung fbshipit-source-id: d99db3b5d12e7bf9676bab70e680d4b88031ae2d	2021-08-26 20:17:48 -07:00
Peter Bell	560cd88195	Kill THCUNN (#63429 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63429 Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D30441308 Pulled By: ngimel fbshipit-source-id: 3ae342a2f8d5c7f8827b637c4055c5d1b0a1be26	2021-08-23 12:07:16 -07:00
Nikita Shulga	6e5d065b2b	Add pocketfft as submodule (#62841 ) Summary: Using https://github.com/mreineck/pocketfft Also delete explicit installation of pocketfft during the build as it will be available via submodule Limit PocketFFT support to cmake-3.10 or newer, as `set_source_files_properties` does not seem to work as expected with cmake-3.5 Partially addresses https://github.com/pytorch/pytorch/issues/62821 Pull Request resolved: https://github.com/pytorch/pytorch/pull/62841 Reviewed By: seemethere Differential Revision: D30140441 Pulled By: malfet fbshipit-source-id: d1a1cf1b43375321f5ec5b3d0b538f58082f7825	2021-08-17 15:29:56 -07:00
Shen Li	1022443168	Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: revert-hammer Differential Revision: D30279364 (`b004307252`) Original commit changeset: c1ed77dfe43a fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e	2021-08-12 11:45:01 -07:00
Zsolt Dollenstein	b004307252	[codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: manual inspection & sandcastle Reviewed By: zertosh Differential Revision: D30279364 fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a	2021-08-12 10:58:35 -07:00
Can Balioglu	7565039ee9	Support system-provided Intel TBB (#61934 ) Summary: This PR: (1) enables the use of a system-provided Intel TBB for building PyTorch, (2) removes `tbb::task_scheduler_init` references since it was removed from TBB a while ago, (3) marks the implementation of `_internal_set_num_threads` with a TODO as it requires a revision that fixes its thread allocation logic. Tested with `test/run_test`; no new tests are introduced since there are no behavioral changes (removal of `tbb::task_scheduler_init` has no impact on the runtime behavior). Pull Request resolved: https://github.com/pytorch/pytorch/pull/61934 Reviewed By: malfet Differential Revision: D29805416 Pulled By: cbalioglu fbshipit-source-id: 22042b428b57b8fede9dfcc83878d679a19561dd	2021-08-02 07:39:00 -07:00
imaginary-person	9e53c823b8	Add AVX512 support in ATen & remove AVX support (#61903 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61903 ### Remaining Tasks - [ ] Collate results of benchmarks on two Intel Xeon machines (with & without CUDA, to check if CPU throttling causes issues with GPUs) - make graphs, including Roofline model plots (Intel Advisor can't make them with libgomp, though, but with Intel OpenMP). ### Summary 1. This draft PR produces binaries with 3 types of ATen kernels - default, AVX2, AVX512. Using the environment variable `ATEN_AVX512_256=TRUE` also results in 3 types of kernels, but the compiler can use 32 ymm registers for AVX2, instead of the default 16. ATen kernels for `CPU_CAPABILITY_AVX` have been removed. 2. `nansum` is not using AVX512 kernel right now, as it has poorer accuracy for Float16, than does AVX2 or DEFAULT, whose respective accuracies aren't very good either (#59415). It was more convenient to disable AVX512 dispatch for all dtypes of `nansum` for now. 3. On Windows, ATen Quantized AVX512 kernels are not being used, as quantization tests are flaky. If `--continue-through-failure` is used, then `test_compare_model_outputs_functional_static` fails. But if this test is skipped, `test_compare_model_outputs_conv_static` fails. If both these tests are skipped, then a third one fails. These are hard to debug right now due to not having access to a Windows machine with AVX512 support, so it was more convenient to disable AVX512 dispatch of all ATen Quantized kernels on Windows for now. 4. One test is currently being skipped - [`test_lstm` in `quantization.bc`](https://github.com/pytorch/pytorch/issues/59098) - It fails only on Cascade Lake machines, irrespective of the `ATEN_CPU_CAPABILITY` used, because FBGEMM uses `AVX512_VNNI` on machines that support it. The value of `reduce_range` should be used as `False` on such machines. 
The list of the changes is at https://gist.github.com/imaginary-person/4b4fda660534f0493bf9573d511a878d. Credits to ezyang for proposing `AVX512_256` - these use AVX2 intrinsics but benefit from 32 registers, instead of the 16 ymm registers that AVX2 uses. Credits to limo1996 for the initial proposal, and for optimizing `hsub_pd` & `hadd_pd`, which didn't have direct AVX512 equivalents, and are being used in some kernels. He also refactored `vec/functional.h` to remove duplicated code. Credits to quickwritereader for helping fix 4 failing complex multiplication & division tests. ### Testing 1. `vec_test_all_types` was modified to test basic AVX512 support, as tests already existed for AVX2. Only one test had to be modified, as it was hardcoded for AVX2. 2. `pytorch_linux_bionic_py3_8_gcc9_coverage_test1` & `pytorch_linux_bionic_py3_8_gcc9_coverage_test2` are now using `linux.2xlarge` instances, as they support AVX512. They were used for testing AVX512 kernels, as AVX512 kernels are being used by default in both of the CI checks. Windows CI checks had already been using machines with AVX512 support. ### Would the downclocking caused by AVX512 pose an issue? I think it's important to note that AVX2 causes downclocking as well, and the additional downclocking caused by AVX512 may not hamper performance on some Skylake machines & beyond, because of the double vector-size. I think that [this post with verifiable references is a must-read](https://community.intel.com/t5/Software-Tuning-Performance/Unexpected-power-vs-cores-profile-for-MKL-kernels-on-modern-Xeon/m-p/1133869/highlight/true#M6450). Also, AVX512 would _probably not_ hurt performance on a high-end machine, [but measurements are recommended](https://lemire.me/blog/2018/09/07/avx-512-when-and-how-to-use-these-new-instructions/). In case it does, `ATEN_AVX512_256=TRUE` can be used for building PyTorch, as AVX2 can then use 32 ymm registers instead of the default 16. 
[FBGEMM uses `AVX512_256` only on Xeon D processors](https://github.com/pytorch/FBGEMM/pull/209), which are said to have poor AVX512 performance. This [official data](https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-scalable-spec-update.pdf) is for the Intel Skylake family, and the first link helps understand its significance. Cascade Lake & Ice Lake SP Xeon processors are said to be even better when it comes to AVX512 performance. Here is the corresponding data for [Cascade Lake](https://cdrdv2.intel.com/v1/dl/getContent/338848) - ![CASCADE LAKE AVX2](https://user-images.githubusercontent.com/76181208/120666172-ffec3f80-c451-11eb-8ea1-8933ccc12a1b.PNG) ![CASCADE LAKE AVX512](https://user-images.githubusercontent.com/76181208/120666190-04b0f380-c452-11eb-9faa-38d233c874c8.PNG) The corresponding data isn't publicly available for Intel Xeon SP 3rd gen (Ice Lake SP), but [Intel mentioned that the 3rd gen has frequency improvements pertaining to AVX512](https://newsroom.intel.com/wp-content/uploads/sites/11/2021/04/3rd-Gen-Intel-Xeon-Scalable-Platform-Press-Presentation-281884.pdf). Ice Lake SP machines also have 48 KB L1D caches, so that's another reason for AVX512 performance to be better on them. ### Is PyTorch always faster with AVX512? No, but then PyTorch is not always faster with AVX2 either. Please refer to #60202. The benefit from vectorization is apparent with small tensors that fit in caches or in kernels that are more compute heavy. For instance, AVX512 or AVX2 would yield no benefit for adding two 64 MB tensors, but adding two 1 MB tensors would do well with AVX2, and even more so with AVX512. It seems that memory-bound computations, such as adding two 64 MB tensors, can be slow with vectorization (depending upon the number of threads used), as the effects of downclocking can then be observed. 
Original pull request: https://github.com/pytorch/pytorch/pull/56992 Reviewed By: soulitzer Differential Revision: D29266289 Pulled By: ezyang fbshipit-source-id: 2d5e8d1c2307252f22423bbc14f136c67c3e6184	2021-07-22 08:51:49 -07:00
zhouzhuojie	6107cf3750	Add --jobs 0 for git submodule update (#61311 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61311 Pull Request resolved: https://github.com/pytorch/pytorch/pull/61152 Some related docs about `submodule.fetchJobs` https://git-scm.com/docs/git-config#Documentation/git-config.txt-submodulefetchJobs ``` time git submodule update --init --recursive ________________________________________________________ Executed in 243.20 secs fish external usr time 49.64 secs 213.00 micros 49.64 secs sys time 29.27 secs 795.00 micros 29.27 secs ``` ``` time git submodule update --init --recursive --jobs 4 ________________________________________________________ Executed in 143.04 secs fish external usr time 51.06 secs 246.00 micros 51.06 secs sys time 30.96 secs 742.00 micros 30.96 secs ``` ``` time git submodule update --init --recursive --jobs 8 ________________________________________________________ Executed in 124.64 secs fish external usr time 51.76 secs 264.00 micros 51.76 secs sys time 30.49 secs 739.00 micros 30.49 secs ``` ``` time git submodule update --init --recursive --jobs 0 # use all online cpus ________________________________________________________ Executed in 129.75 secs fish external usr time 51.64 secs 181.00 micros 51.64 secs sys time 31.49 secs 781.00 micros 31.49 secs ``` Test Plan: Imported from OSS Reviewed By: 1ntEgr8 Differential Revision: D29560875 Pulled By: zhouzhuojie fbshipit-source-id: 556027dffe744c66428075a8a1bf64683930aaaf	2021-07-07 16:28:18 -07:00
Nathan John Sircombe	bf00d26deb	Enables builds with Compute Library backend for oneDNN (#55913 ) Summary: Since v1.7, oneDNN (MKL-DNN) has supported the use of Compute Library for the Arm architecture to provide optimised convolution primitives on AArch64. This change enables the use of Compute Library in the PyTorch build. Following the approach used to enable the use of CBLAS in MKLDNN, it is enabled by setting the env vars USE_MKLDNN and USE_MKLDNN_ACL. The location of the Compute Library build must be set using `ACL_ROOT_DIR`. This is an extension of the work in https://github.com/pytorch/pytorch/pull/50400 which added support for the oneDNN/MKL-DNN backend on AArch64. _Note: this assumes that Compute Library has been built and installed at ACL_ROOT_DIR. Compute library can be downloaded here: `https://github.com/ARM-software/ComputeLibrary`_ Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/55913 Reviewed By: ailzhang Differential Revision: D28559516 Pulled By: malfet fbshipit-source-id: 29d24996097d0a54efc9ab754fb3f0bded290005	2021-05-20 07:43:56 -07:00
Winston Smith	47c566ebb1	Rename namespace `vec256` to `vec`, struct `Vec256` to `Vectorized` (and other related classes/structs) (#58438 ) Summary: In order to make it more convenient for maintainers to review the ATen AVX512 implementation, the namespace `vec256` is being renamed to `vec` in this PR, as modifying 77 files & creating 2 new files only took a few minutes, as these changes aren't significant, so fewer files would've to be reviewed while reviewing https://github.com/pytorch/pytorch/issues/56992. The struct `Vec256` is not being renamed to `Vec`, but `Vectorized` instead, because there are some `using Vec=` statements in the codebase, so renaming it to `Vectorized` was more convenient. However, I can still rename it to `Vec`, if required. ### Changes made in this PR - Created `aten/src/ATen/cpu/vec` with subdirectory `vec256` (vec512 would be added via https://github.com/pytorch/pytorch/issues/56992). The changes were made in this manner - 1. First, a script was run to rename `vec256` to `vec` & `Vec256` to `Vectorized` - ``` # Ref: https://stackoverflow.com/a/20721292 cd aten/src grep -rli 'vec256\/vec256\.h' * | xargs -i@ sed -i 's/vec256\/vec256\.h/vec\/vec\.h/g' @ grep -rli 'vec256\/functional\.h' * | xargs -i@ sed -i 's/vec256\/functional\.h/vec\/functional\.h/g' @ grep -rli 'vec256\/intrinsics\.h' * | xargs -i@ sed -i 's/vec256\/intrinsics\.h/vec\/vec256\/intrinsics\.h/g' @ grep -rli 'namespace vec256' * | xargs -i@ sed -i 's/namespace vec256/namespace vec/g' @ grep -rli 'Vec256' * | xargs -i@ sed -i 's/Vec256/Vectorized/g' @ grep -rli 'vec256\:\:' * | xargs -i@ sed -i 's/vec256\:\:/vec\:\:/g' @ grep -rli 'at\:\:vec256' * | xargs -i@ sed -i 's/at\:\:vec256/at\:\:vec/g' @ cd ATen/cpu mkdir vec mv vec256 vec cd vec/vec256 grep -rli 'cpu\/vec256\/' * | xargs -i@ sed -i 's/cpu\/vec256\//cpu\/vec\/vec256\//g' @ grep -rli 'vec\/vec\.h' * | xargs -i@ sed -i 's/vec\/vec\.h/vec\/vec256\.h/g' @ ``` 2. 
`vec256` & `VEC256` were replaced with `vec` & `VEC` respectively in 4 CMake files. 3. In `pytorch_vec/aten/src/ATen/test/`, `vec256_test_all_types.h` & `vec256_test_all_types.cpp` were renamed. 4. `pytorch_vec/aten/src/ATen/cpu/vec/vec.h` & `pytorch_vec/aten/src/ATen/cpu/vec/functional.h` were created. Both currently have one line each & would have 5 when AVX512 support would be added for ATen. Pull Request resolved: https://github.com/pytorch/pytorch/pull/58438 Reviewed By: malfet Differential Revision: D28509615 Pulled By: ezyang fbshipit-source-id: 63840df5f23b3b59e203d25816e2977c6a901780	2021-05-19 16:04:36 -07:00
Xiang Gao	6c70cbedb6	step 0 of cuDNN v8 convolution API integration (#51390 ) Summary: This PR is step 0 of adding PyTorch convolution bindings using the cuDNN frontend. The cuDNN frontend is the recommended way of using cuDNN v8 API. It is supposed to have faster release cycles, so that, for example, if people find a specific kernel has a bug, they can report it, and that kernel will be blocked in the cuDNN frontend and frameworks could just update that submodule without the need for waiting for a whole cuDNN release. The work is not complete, and this PR is only step 0. What this PR does: - Add cudnn-frontend as a submodule. - Modify cmake to build that submodule. - Add bindings for convolution forward in `Conv_v8.cpp`, which is disabled by a macro by default. - Tested manually by enabling the macro and run `test_nn.py`. All tests pass except those mentioned below. What this PR doesn't: - Only convolution forward, no backward. The backward will use v7 API. - No 64bit-indexing support for some configuration. This is a known issue of cuDNN, and will be fixed in a later cuDNN version. PyTorch will not implement any workaround for issue, but instead, v8 API should be disabled on problematic cuDNN versions. - No test beyond PyTorch's unit tests. - Not tested for correctness on real models. - Not benchmarked for performance. - Benchmark cache is not thread-safe. (This is marked as `FIXME` in the code, and will be fixed in a follow-up PR) - cuDNN benchmark is not supported. - There are failing tests, which will be resolved later: ``` FAILED test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_cudnn_nhwc_cuda_float16 - AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0.001 and atol=1e-05, found 32 element(s) (out of 32) whose difference(s) exceeded the margin of error (in... 
FAILED test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_cudnn_nhwc_cuda_float32 - AssertionError: False is not true : Tensors failed to compare as equal!With rtol=1.3e-06 and atol=1e-05, found 32 element(s) (out of 32) whose difference(s) exceeded the margin of error (... FAILED test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_large_cuda - RuntimeError: CUDNN_BACKEND_OPERATION: cudnnFinalize Failed cudnn_status: 9 FAILED test/test_nn.py::TestNN::test_Conv2d_depthwise_naive_groups_cuda - AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0 and atol=1e-05, found 64 element(s) (out of 64) whose difference(s) exceeded the margin of error (including 0 an... FAILED test/test_nn.py::TestNN::test_Conv2d_deterministic_cudnn - RuntimeError: not supported yet FAILED test/test_nn.py::TestNN::test_ConvTranspose2d_groups_cuda_fp32 - RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM FAILED test/test_nn.py::TestNN::test_ConvTranspose2d_groups_cuda_tf32 - RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM ``` Although this is not a complete implementation of cuDNN v8 API binding, I still want to merge this first. This would allow me to do small and incremental work, for the ease of development and review. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51390 Reviewed By: malfet Differential Revision: D28513167 Pulled By: ngimel fbshipit-source-id: 9cc20c9dec5bbbcb1f94ac9e0f59b10c34f62740	2021-05-19 12:54:09 -07:00
davidriazati@fb.com	c44cbc63cc	Ignore more compiler warnings, unify WERROR options (#56630 ) Summary: This adds some more compiler warning ignores for everything that happens on a standard CPU build (CUDA builds still have a bunch of warnings so we can't turn on `-Werror` everywhere yet). Pull Request resolved: https://github.com/pytorch/pytorch/pull/56630 Pulled By: driazati Reviewed By: malfet Differential Revision: D28005063 fbshipit-source-id: 541ed415eb0470ddf7e08c22c5eb6da9db26e9a0	2021-04-29 21:20:29 -07:00
davidriazati@fb.com	4b96fc060b	Remove distutils (#57040 ) Summary: [distutils](https://docs.python.org/3/library/distutils.html) is on its way out and will be deprecated-on-import for Python 3.10+ and removed in Python 3.12 (see [PEP 632](https://www.python.org/dev/peps/pep-0632/)). There's no reason for us to keep it around since all the functionality we want from it can be found in `setuptools` / `sysconfig`. `setuptools` includes a copy of most of `distutils` (which is fine to use according to the PEP), that it uses under the hood, so this PR also uses that in some places. Fixes #56527 Pull Request resolved: https://github.com/pytorch/pytorch/pull/57040 Pulled By: driazati Reviewed By: nikithamalgifb Differential Revision: D28051356 fbshipit-source-id: 1ca312219032540e755593e50da0c9e23c62d720	2021-04-29 12:10:11 -07:00
David Reiss	89377e3e45	model_dump tool for model inspection (#56868 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56868 See __init__.py for a summary of the tool. The following sections are present in this initial version - Model Size. Show the total model size, as well as a breakdown by stored files, compressed files, and zip overhead. (I expect this breakdown to be a bit more useful once data.pkl is compressed.) - Model Structure. This is basically the output of `show_pickle(data.pkl)`, but as a hierarchical structure. Some structures cause this view to crash right now, but it can be improved incrementally. - Zip Contents. This is basically the output of `zipinfo -l`. - Code. This is the TorchScript code. It's integrated with a blame window at the bottom, so you can click "Blame Code", then click a bit of code to see where it came from (based on the debug_pkl). This currently doesn't render properly if debug_pkl is missing or incomplete. - Extra files (JSON). JSON dumps of each json file under /extra/, up to a size limit. - Extra Pickles. For each .pkl file in the model, we safely unpickle it with `show_pickle`, then render it with `pprint` and include it here if the size is not too large. We aren't able to install the pprint hack that the show_pickle CLI uses, so we get one-line rendering for custom objects, which is not very useful. Built-in types look fine, though. In particular, bytecode.pkl seems to look fine (and we hard-code that file to ignore the size limit). I'm checking in the JS dependencies to avoid a network dependency at runtime. They were retrieved from the following URLs, then passed through a JS minifier: https://unpkg.com/htm@3.0.4/dist/htm.module.js?module https://unpkg.com/preact@10.5.13/dist/preact.module.js?module Test Plan: Manually ran on a few models I had lying around. Mostly tested in Chrome, but I also poked around in Firefox. 
Reviewed By: dhruvbird Differential Revision: D28020849 Pulled By: dreiss fbshipit-source-id: 421c30ed7ca55244e9fda1a03b8aab830466536d	2021-04-28 07:33:10 -07:00
Bert Maher	90f848572c	NNC depthwise conv2d implementation (#54920 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54920 Add a depthwise convolution implementation and reasonably good schedules for 3x3 stride=1,2. ghstack-source-id: 126076113 Test Plan: new tensorexpr test: Conv.DepthwiseConv2D Reviewed By: ZolotukhinM Differential Revision: D27413745 fbshipit-source-id: 833da6072b655fbe2b679704e9d56a08e1bf7e7e	2021-04-08 21:56:53 -07:00
Nikita Shulga	14a2501786	Update max-version in setup.py to 3.9 (#54690 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54690 Reviewed By: seemethere Differential Revision: D27330462 Pulled By: malfet fbshipit-source-id: db332acf5aa5bff67af2bef777935f2387bc963c	2021-03-26 12:45:03 -07:00
Nikita Shulga	e8e570e9c5	[MacOS] Cross compile stub when building for M1 on x86 (#54046 ) Summary: Also rename `CROSS_COMPILE_ARM` to `CROSS_COMPILE_ARM64` Pull Request resolved: https://github.com/pytorch/pytorch/pull/54046 Reviewed By: walterddr Differential Revision: D27071928 Pulled By: malfet fbshipit-source-id: 9143cd5d110ed67f0609f0a4bbb20922012ee665	2021-03-16 00:24:09 -07:00
James Butterworth	37ab711822	Adding learning rate schedulers to C++ API (#52268 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/50577 Learning rate schedulers had not yet been implemented for the C++ API. This pull request introduces the learning rate scheduler base class and the StepLR subclass. Furthermore, it modifies the existing OptimizerOptions such that the learning rate scheduler can modify the learning rate. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52268 Reviewed By: mrshenli Differential Revision: D26818387 Pulled By: glaringlee fbshipit-source-id: 2b28024a8ea7081947c77374d6d643fdaa7174c1	2021-03-10 23:09:51 -08:00
Nikita Shulga	7e6a84d238	Add logic to auto-fetch submodules (#53461 ) Summary: In setup.py add logic to: - Get list of submodules from .gitmodules file - Auto-fetch submodules if none of them has been fetched In CI: - Test this on non-docker capable OSes (Windows and Mac) - Use shallow submodule checkouts whenever possible Pull Request resolved: https://github.com/pytorch/pytorch/pull/53461 Reviewed By: ezyang Differential Revision: D26871119 Pulled By: malfet fbshipit-source-id: 8b23d6a4fcf04446eac11446e0113819476ef6ea	2021-03-09 09:13:35 -08:00
Andrew Millspaugh	1fc8831322	Add missing tensor header (#53489 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53489 It appears that D26675801 (`1fe6a6507e`) broke Glow builds (and probably other installs) with the inclusion of the python_arg_parser include. That dep lives in a directory of its own and was not included in the setup.py. Test Plan: OSS tests should catch this. Reviewed By: ngimel Differential Revision: D26878180 fbshipit-source-id: 70981340226a9681bb9d5420db56abba75e7f0a5	2021-03-08 12:05:17 -08:00
Rong Rong (AI Infra)	f58f7b786c	add distributed backend options in setup.py (#53214 ) Summary: Currently there's only one indicator for build_ext regarding distributed backend: `USE_DISTRIBUTED`. However, one can build with selective backends. This adds the 3 distributed backend options in setup.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/53214 Test Plan: Set the 3 options in environment and locally ran `python setup.py build_ext` Reviewed By: janeyx99 Differential Revision: D26818259 Pulled By: walterddr fbshipit-source-id: 688e8f83383d10ce23ee1f019be33557ce5cce07	2021-03-05 14:39:36 -08:00
Nikita Shulga	272dfc7bb9	Add MANIFEST.in (#52908 ) Summary: Do not build PyTorch if `setup.py` is called with 'sdist' option Regenerate bundled license while sdist package is being built Refactor `check_submodules` out of `build_deps` and check that submodules project are present during source package build stage. Test that sdist package is configurable during `asan-build` step Fixes https://github.com/pytorch/pytorch/issues/52843 Pull Request resolved: https://github.com/pytorch/pytorch/pull/52908 Reviewed By: walterddr Differential Revision: D26685176 Pulled By: malfet fbshipit-source-id: 972a40ae36e194c0b4e0fc31c5e1af1e7a815185	2021-03-01 18:28:25 -08:00
Nikita Shulga	a0a1bb074b	Make NumPy dependency dynamic (#52794 ) Summary: Move NumPy initialization from `initModule()` to singleton inside `torch::utils::is_numpy_available()` function. This singleton will print a warning that NumPy integration is not available, rather than failing to import torch altogether. The warning will be printed only once, and will look something like the following: ``` UserWarning: Failed to initialize NumPy: No module named 'numpy.core' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:66.) ``` This is helpful if PyTorch was compiled with the wrong NumPy version, or if NumPy is not commonly available on the platform (which is often the case on AARCH64 or Apple M1) Test that PyTorch is usable after numpy is uninstalled at the end of `_test1` CI config. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52794 Reviewed By: seemethere Differential Revision: D26650509 Pulled By: malfet fbshipit-source-id: a2d98769ef873862c3704be4afda075d76d3ad06	2021-02-25 19:45:00 -08:00
mattip	9cbefad83f	concatenate LICENSE files when building a wheel (#51634 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/50695 I checked locally that the concatenated license file appears at `torch-<version>.dist-info/LICENSE` in the wheel. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51634 Reviewed By: zhangguanheng66 Differential Revision: D26225550 Pulled By: walterddr fbshipit-source-id: 830c59fb7aea0eb50b99e295edddad9edab6ba3a	2021-02-08 08:28:46 -08:00
Ilia Cherniavskii	e34992ebee	Set USE_KINETO=1 (#49897 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49897 Resend of https://github.com/pytorch/pytorch/pull/49201 Test Plan: see 49201 Reviewed By: malfet Differential Revision: D25717102 Pulled By: ilia-cher fbshipit-source-id: 5e794a7f5fe160ca64ac9d190c4fd3e8f1e443e6	2021-01-22 00:09:21 -08:00
Richard Barnes	a5339b9d7c	Drop unused imports from leftovers (#49953 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49953 From ``` ./python/libcst/libcst codemod remove_unused_imports.RemoveUnusedImportsWithGlean --no-format caffe2/ ``` Test Plan: Standard sandcastle tests Reviewed By: xush6528 Differential Revision: D25727348 fbshipit-source-id: b3feef80b9b4b535f1bd4060dace5b1a50bd5e69	2021-01-04 16:31:48 -08:00
Protonu Basu	4c5a4dbb8c	[Tensorexpr]Copying header files in tensorexpr dir (#49933 ) Summary: Previously header files from jit/tensorexpr were not copied, this PR should enable copying. This will allow other OSS projects like Glow to use TE. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49933 Reviewed By: Krovatkin, mruberry Differential Revision: D25725927 Pulled By: protonu fbshipit-source-id: 9d5a0586e9b73111230cacf044cd7e8f5c600ce9	2020-12-29 15:18:52 -08:00
Ilia Cherniavskii	72b00a8a52	Revert D25480770: Set USE_KINETO=1 Test Plan: revert-hammer Differential Revision: D25480770 (`1a92802bde`) Original commit changeset: 037cd774f554 fbshipit-source-id: 6a6062195033ca91fcc0cfa1e890e47efc774ac1	2020-12-18 07:06:28 -08:00
Ilia Cherniavskii	1a92802bde	Set USE_KINETO=1 (#49201 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49201 This unblocks kineto profiler for 1.8 release. This PR supersedes https://github.com/pytorch/pytorch/pull/48391 Note: this will somewhat increase the size of linux server binaries, bc we add libkineto.a and libcupti_static.a: -rw-r--r-- 1 jenkins jenkins 1107502 Dec 10 21:16 build/lib/libkineto.a -rw-r--r-- 1 root root 13699658 Nov 13 2019 /usr/local/cuda/lib64/libcupti_static.a Test Plan: CI https://github.com/pytorch/pytorch/pull/48391 Imported from OSS Reviewed By: ngimel Differential Revision: D25480770 fbshipit-source-id: 037cd774f5547d9918d6055ef5cc952a54e48e4c	2020-12-18 01:48:10 -08:00
Taylor Robie	0225d3dc9d	Add support for timing C++ snippets. (#47864 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47864 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D25199262 Pulled By: robieta fbshipit-source-id: 1c2114628ed543fba4f403bf49c065f4d71388e2	2020-12-01 20:03:14 -08:00
Taylor Robie	17ea11259a	Rework compat bindings. (#47863 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47863 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D25199261 Pulled By: robieta fbshipit-source-id: 0a4a0409ddb75c1bf66cd31d67b55080227b1679	2020-12-01 20:03:11 -08:00
Nikita Shulga	2dff0b3e91	Fix typos in comments (#48316 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48316 Reviewed By: walterddr, mrshenli Differential Revision: D25125123 Pulled By: malfet fbshipit-source-id: 6f31e5456cc078cc61b288191f1933711acebba0	2020-11-24 10:56:40 -08:00
Ilia Cherniavskii	f2da18af14	Add USE_KINETO build option (#45888 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45888 Adding USE_LIBKINETO build option Test Plan: USE_KINETO=1 USE_CUDA=1 USE_MKLDNN=1 BLAS=MKL BUILD_BINARY=1 python setup.py develop install --cmake Reviewed By: Chillee Differential Revision: D25142221 Pulled By: ilia-cher fbshipit-source-id: d1634a8f9599604ff511fac59b9072854289510c	2020-11-21 20:20:32 -08:00
Nikita Shulga	d7c8d3cccb	Remove references to `typing` module from setup.py (#47677 ) Summary: It is part of core Python-3.6.2+ Fixes https://github.com/pytorch/pytorch/issues/47596 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47677 Reviewed By: walterddr Differential Revision: D24860188 Pulled By: malfet fbshipit-source-id: ad72b433a4493ebe5caca97c2e8a9d4b3c8172d4	2020-11-12 10:04:38 -08:00
peter	a08e8dd70c	Fix python 3.9 builds on Windows (#47602 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/47460. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47602 Reviewed By: heitorschueroff Differential Revision: D24832487 Pulled By: malfet fbshipit-source-id: 8846caeac5e767e8066470d5c981218f147c88dc	2020-11-09 12:39:28 -08:00
Nikita Shulga	6f6025183f	Skip iomp5 embedding if torch_cpu could not be found (#47390 ) Summary: This would be the case when the package is built for local development rather than for installation Pull Request resolved: https://github.com/pytorch/pytorch/pull/47390 Reviewed By: janeyx99 Differential Revision: D24738416 Pulled By: malfet fbshipit-source-id: 22bd676bc46e5d50a09539c969ce56d37cfe5952	2020-11-04 14:22:53 -08:00
Nikita Shulga	3a0024574d	Do not delete rpath from torch.dylib on Darwin (#47337 ) Summary: Fixes CI regressions introduced by https://github.com/pytorch/pytorch/issues/47262 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47337 Reviewed By: ngimel Differential Revision: D24721954 Pulled By: malfet fbshipit-source-id: 395b037b29c0fc3b62ca50bba9be940ad72e0c5b	2020-11-03 22:36:35 -08:00
Nikita Shulga	ca61b061f3	Update minimum supported Python version to 3.6.2 (#47314 ) Summary: As typing.NoReturn is used in the codebase Pull Request resolved: https://github.com/pytorch/pytorch/pull/47314 Reviewed By: seemethere Differential Revision: D24712847 Pulled By: malfet fbshipit-source-id: f0692d408316d630bc11f1ee881b695437fb47d4	2020-11-03 13:32:07 -08:00
Nikita Shulga	14194e4f23	Embed `libiomp5.dylib` into wheel package (#47262 ) Summary: libiomp runtime is the only external dependency OS X package has if compiled with MKL Copy it to the stage directory from one of the available rpaths And remove all absolute rpaths, since project should have none Fixes https://github.com/pytorch/pytorch/issues/38607 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47262 Reviewed By: walterddr Differential Revision: D24705094 Pulled By: malfet fbshipit-source-id: 9f588a3ec3c6c836c8986d858fb53df815a506c8	2020-11-03 13:00:30 -08:00
Nikita Shulga	8c39f198b4	Fix typo in setup.py (#46921 ) Summary: Also, be a bit future-proof in support version list Pull Request resolved: https://github.com/pytorch/pytorch/pull/46921 Reviewed By: seemethere Differential Revision: D24568733 Pulled By: malfet fbshipit-source-id: ae34f8da1ed39b80dc34db0b06e4ef142104a3ff	2020-10-27 13:14:41 -07:00
Nikita Shulga	a38eeeff5c	Make setup.py python 2 friendly (#46317 ) Summary: import print_function to make setup.py invoked by Python2 print human readable error: ``` % python2 setup.py Python 2 has reached end-of-life and is no longer supported by PyTorch. ``` Also, remove `future` from the list of the PyTorch package install dependencies Pull Request resolved: https://github.com/pytorch/pytorch/pull/46317 Reviewed By: walterddr, bugra Differential Revision: D24305004 Pulled By: malfet fbshipit-source-id: 9181186170562384dd2c0e6a8ff0b1e93508f221	2020-10-14 16:37:06 -07:00
Nikita Shulga	45de2ee3ac	Remove Python version upper boundary check (#46315 ) Summary: This prevents setup.py from erroring out when Python-3.9 is used Fixes https://github.com/pytorch/pytorch/issues/46314 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46315 Reviewed By: heitorschueroff Differential Revision: D24304846 Pulled By: malfet fbshipit-source-id: 573a88ea8c1572d7d8a9991539effb3c228bffc9	2020-10-14 07:36:55 -07:00
Eli Uriegas	615013edcb	setup: Dataclasses only when < 3.7 (#45844 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45844 Someone pointed out that dataclasses were actually added to the python stdlib in 3.7 and not 3.8, so bumping down the dependency on dataclasses from 3.8 -> 3.7 makes sense here Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: walterddr, malfet Differential Revision: D24113367 Pulled By: seemethere fbshipit-source-id: 03d2d93f7d966d48a30a8e2545fd07dfe63b4fb3	2020-10-05 13:29:21 -07:00
Michael Suo	18253f4a48	Fix BUILD_CAFFE2 if FBGEMM and NNPACK are not built (#45610 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45610 Also add to the usual documentation places that this option exists. Test Plan: Imported from OSS Reviewed By: gmagogsfm Differential Revision: D24058199 Pulled By: suo fbshipit-source-id: 81574fbd042f47587e2c7820c726fac0f68af2a7	2020-10-01 14:58:55 -07:00
Eli Uriegas	5959de3aeb	setup: Only include dataclasses for py < 3.8 (#45611 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45611 dataclasses was made a standard library item in 3.8 Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: walterddr Differential Revision: D24031740 Pulled By: seemethere fbshipit-source-id: 15bdf1fe0d8de9b8ba7912e4a651f06b18d516ee	2020-10-01 14:52:28 -07:00
Bugra Akyildiz	27c7158166	Remove __future__ imports for legacy Python2 supports (#45033 ) Summary: There is a module called `2to3` which you can target for future specifically to remove these, the directory of `caffe2` has the most redundant imports: ```2to3 -f future -w caffe2``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033 Reviewed By: seemethere Differential Revision: D23808648 Pulled By: bugra fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38	2020-09-23 17:57:02 -07:00
Daily, Jeff	b98ac20849	install ATen/native/cuda and hip headers (#45097 ) Summary: The ATen/native/cuda headers were copied to torch/include, but then not included in the final package. Further, add ATen/native/hip headers to the installation, as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45097 Reviewed By: mruberry Differential Revision: D23831006 Pulled By: malfet fbshipit-source-id: ab527928185faaa912fd8cab208733a9b11a097b	2020-09-22 17:43:47 -07:00
Michael Suo	161490d441	Move `torch/version.py` generation to cmake (#44577 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44577 I would like to move this to cmake so that I can depend on it happening from other parts of the build. This PR pulls out the logic for determining the version string and writing the version file into its own module. `setup.py` still receives the version string and uses it as before, but now the code for writing out `torch/version.py` lives in a custom command in torch/CMakeLists.txt I noticed a small inconsistency in how version info is populated. `TORCH_BUILD_VERSION` is populated from `setup.py` at configuration time, while `torch/version.py` is written at build time. So if, e.g. you configured cmake on a certain git rev, then built it on another, the two versions would be inconsistent. This does not appear to matter, so I opted to preserve the existing behavior. Test Plan: Imported from OSS Reviewed By: bertmaher Differential Revision: D23734781 Pulled By: suo fbshipit-source-id: 4002c9ec8058503dc0550f8eece2256bc98c03a4	2020-09-16 15:49:22 -07:00
Alexander Grund	d23f3170ef	Remove pybind11 from required submodules (#44278 ) Summary: This can be taken from the system in which case it is not used from the submodule. Hence the check here limits the usage unnecessarily ccing malfet Pull Request resolved: https://github.com/pytorch/pytorch/pull/44278 Reviewed By: malfet Differential Revision: D23568552 Pulled By: ezyang fbshipit-source-id: 7fd2613251567f649b12eca0b1fe7663db9cb58d	2020-09-09 08:07:13 -07:00
Edward Yang	6ea89166bd	Rewrite of ATen code generator (#42629 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42629 How to approach reviewing this diff: - The new codegen itself lives in `tools/codegen`. Start with `gen.py`, then read `model.py` and then the `api/` folder. The comments at the top of the files describe what is going on. The CLI interface of the new codegen is similar to the old one, but (1) it is no longer necessary to explicitly specify cwrap inputs (and now we will error if you do so) and (2) the default settings for source and install dir are much better; to the extent that if you run the codegen from the root source directory as just `python -m tools.codegen.gen`, something reasonable will happen. - The old codegen is (nearly) entirely deleted; every Python file in `aten/src/ATen` was deleted except for `common_with_cwrap.py`, which now permanently finds its home in `tools/shared/cwrap_common.py` (previously cmake copied the file there), and `code_template.py`, which now lives in `tools/codegen/code_template.py`. We remove the copying logic for `common_with_cwrap.py`. - All of the inputs to the old codegen are deleted. - Build rules now have to be adjusted to not refer to files that no longer exist, and to abide by the (slightly modified) CLI. - LegacyTHFunctions files have been generated and checked in. We expect these to be deleted as these final functions get ported to ATen. The deletion process is straightforward; just delete the functions of the ones you are porting. There are 39 more functions left to port. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D23183978 Pulled By: ezyang fbshipit-source-id: 6073ba432ad182c7284a97147b05f0574a02f763	2020-08-31 09:00:22 -07:00
Hong Xu	9063bcee04	Don't proceed into setup.py too far if Python version is unsupported (#42870 ) Summary: This prevents confusing errors when the interpreter encounters some syntax errors in the middle. Pull Request resolved: https://github.com/pytorch/pytorch/pull/42870 Reviewed By: albanD Differential Revision: D23269265 Pulled By: ezyang fbshipit-source-id: 61f62cbe294078ad4a909fa87aa93abd08c26344	2020-08-28 09:04:55 -07:00
Luca Wehrstedt	c30bc6d4d7	Update TensorPipe submodule (#42522 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42522 Main changes: - Consolidated CMake files to have a single entry point, rather than having a specialized one for PyTorch. - Changed the way the preprocessor flags are provided, and changed their name. There were a few instances in PyTorch's CMake files where we were directly adding TensorPipe's source directory as an include path, which however doesn't contain the auto-generated header we now added. We fix that by adding the `tensorpipe` CMake target as a dependency, so that the include paths defined by TensorPipe are used, which contain that auto-generated header. So instead we link those targets to the tensorpipe target in order for them to pick up the correct include directories. I'm turning off SHM and CMA for now because they have never been covered by the CI. I'll enable them in a separate PR so that if they turn out to be flaky we can revert that change without reverting this one. Test Plan: CI Reviewed By: malfet Differential Revision: D22959472 fbshipit-source-id: 1959a41c4a66ef78bf0f3bd5e3964969a2a1bf67	2020-08-06 02:14:58 -07:00
Ralf Gommers	dc1f87c254	Add typing_extensions as a dependency. (#42431 ) Summary: Closes gh-38221. The related pytorch/builder PR: https://github.com/pytorch/builder/pull/475 Pull Request resolved: https://github.com/pytorch/pytorch/pull/42431 Reviewed By: malfet Differential Revision: D22916499 Pulled By: ezyang fbshipit-source-id: c8fe9413b62fc7a6b829fc82aaf32531b55994d1	2020-08-03 20:06:16 -07:00
Nikita Shulga	f00a37dd71	Make setup.py Python-2 syntactically correct (#41960 ) Summary: Import __future__ to make `print(args)` a syntactically correct statement under Python-2 Otherwise, if one accidentally invokes setup.py using Python-2 interpreter they will be greeted by: ``` File "setup.py", line 229 print(args) ^ SyntaxError: invalid syntax ``` instead of: ``` Python 2 has reached end-of-life and is no longer supported by PyTorch. ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/41960 Reviewed By: orionr, seemethere Differential Revision: D22710174 Pulled By: malfet fbshipit-source-id: ffde3ddd585707ba1d39e57e0c6bc9c4c53f8004	2020-07-23 19:10:20 -07:00
Nikita Shulga	883e4c44b2	Raise exception when trying to build PyTorch on 32-bit Windows system (#40321 ) Summary: Makes errors in cases described in https://github.com/pytorch/pytorch/issues/27815 more obvious Pull Request resolved: https://github.com/pytorch/pytorch/pull/40321 Differential Revision: D22198352 Pulled By: malfet fbshipit-source-id: 327d81103c066048dcf5f900fd9083b09942af0e	2020-06-23 16:54:20 -07:00
peter	0f39ed86a7	Cleanup debug info switches with MSVC (#39703 ) Summary: Switch off `/Z7` so that we don't generate debug info in Release and MinSizeRel builds, so that we will probably get smaller static libraries and object files and faster build time Pull Request resolved: https://github.com/pytorch/pytorch/pull/39703 Differential Revision: D21960684 Pulled By: ezyang fbshipit-source-id: 909a237a138183591d667885b13fc311470eed65	2020-06-09 14:11:40 -07:00
Eli Uriegas	b7b7433561	setup: Add long description to wheel packages (#39676 ) Summary: Closes out https://github.com/pytorch/pytorch/issues/38354 For reference: https://packaging.python.org/guides/making-a-pypi-friendly-readme/ Should fill out the PyPI description as well. Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/39676 Reviewed By: malfet Differential Revision: D21940656 Pulled By: seemethere fbshipit-source-id: 6c39500404227047d8f24936db0697fe44a6b9e8	2020-06-08 16:25:39 -07:00
Nikita Shulga	a864dbb360	Make `_C` extension a thin C wrapper (#39375 ) Summary: It just depends on a single `torch_python` library. C library does not depend on standard C++ library and as result it closes https://github.com/pytorch/pytorch/issues/36941 Pull Request resolved: https://github.com/pytorch/pytorch/pull/39375 Reviewed By: orionr Differential Revision: D21840645 Pulled By: malfet fbshipit-source-id: 777c189feee9d6fc686816d92cb9f109b8aac7ca	2020-06-02 13:11:59 -07:00
Meghan Lele	dd7eed5ae4	[JIT] Export JIT backend extension headers in setup.py (#38525 ) Summary: Summary This commit adds the headers required to define and use JIT backends to `package_data` in `setup.py` so that they are exported and copied to the same place as the rest of the headers when PyTorch is installed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38525 Differential Revision: D21601806 Pulled By: SplitInfinity fbshipit-source-id: 1615dd4047777926e013d7dd14fe427d5ffb8b70	2020-05-15 14:45:08 -07:00
David Reiss	328fc70b84	Remove (most) Python 2 support from setup.py (#35617 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35617 Python 2 has reached end-of-life and is no longer supported by PyTorch. Now we can clean up some cruft that we put in place to support it. Test Plan: CI Differential Revision: D20842883 Pulled By: dreiss fbshipit-source-id: 18dc5219ba99658c0ca7e2f26863df008c420e6a	2020-05-14 10:06:20 -07:00
Edward Yang	6edf340338	Delete torch/__init__.pyi, deferring to direct extension stubs (#38157 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38157 This removes the error prone process of assembling `torch/__init__.pyi` (and frequently forgetting to expose things), since now we can simply rely on the true source file to get things done. Most of the old codegen in gen_pyi.py is now rerouted to various files: - `torch/_C/__init__.pyi` (the dumping pile of all misc bindings) - `torch/_C/_nn.pyi` (NN function bindings) - `torch/_C/_VariableFunctions.pyi` (torch function bindings) `torch.types` grew a bunch more definitions that previously were defined in `torch/__init__.pyi` Some miscellaneous changes - Fixed a bug where we treat single TensorList argument as implying varargs are accepted. This is actually only supported on IntList. This means we can correctly generate a stub for dequantize. - Add missing manual stub for nonzero - Switched torch/onnx/operators.py to directly refer to _C module, since apparently mypy doesn't think that methods prefixed with underscores get reexported. This may be a recurring theme; maybe we need to find a better way to solve it. Because I was really lazy, I dumped namedtuple definitions in both `torch._C` and `torch._C._VariableFunctions`. This is definitely wrong. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D21497400 Pulled By: ezyang fbshipit-source-id: 07b126141c82efaca37be27c07255cb2b9b3f064	2020-05-11 07:20:13 -07:00
Jerry Zhang	0ed7fc581c	[quant][graphmode][refactor] Split quantization.cpp (#37975 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37975 Test Plan: . Imported from OSS Differential Revision: D21468497 fbshipit-source-id: 35cbf98a344ca6e4094d616a4040eacf017fd2de	2020-05-08 12:24:50 -07:00
peter	c5d6f59ab1	Replacing EHa with EHsc (#37235 ) Summary: We should not rely on the async exceptions. Catching C++ only exception is more sensible and may get a boost in both space (1163 MB -> 1073 MB, 0.92x) and performance(51m -> 49m, 0.96x). Pull Request resolved: https://github.com/pytorch/pytorch/pull/37235 Differential Revision: D21256918 Pulled By: ezyang fbshipit-source-id: 572ee96f2e4c48ad13f83409e4e113483b3a457a	2020-04-28 08:20:37 -07:00
Mo Zhou	5b9f7f7b0e	[cmake] Add USE_SYSTEM_{GLOO,FP16,PTHREADPOOL,PSIMD,FXDIV,BENCHMARK} options (#14699 ) (#37277 ) Summary: These options are disabled by default, and are supposed to be used by linux distro developers. With the existing shortcut option USE_SYSTEM_LIBS toggled, these new options will be enabled as well. Additionally, when USE_SYSTEM_LIBS is toggled, setup.py should no longer check the existence of git submodules. ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/37277 Differential Revision: D21256999 Pulled By: ezyang fbshipit-source-id: 84f97d008db5a5e41a289cb7bce94906de3c52cf	2020-04-27 09:37:27 -07:00
Mo Zhou	ff21b15624	cmake: add USE_SYSTEM_{LIBS,CPUINFO,SLEEF} options (#14699 ) (#37137 ) Summary: ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/37137 Differential Revision: D21222632 Pulled By: ezyang fbshipit-source-id: 47624b30f8d07b31a40a26edf665bbec39e45202	2020-04-23 20:43:36 -07:00
Christian Kastner	6df90bcecc	setup.py: Remove conflicting double documentation of USE_FBGEMM (#36993 ) Summary: Line 33+ contains instructions on how to disable use, 108+ on how to enable it. The default in CMakeLists.txt is enabled, so drop the latter. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36993 Differential Revision: D21161793 Pulled By: ngimel fbshipit-source-id: 08c5eecaf8768491f90d4a52c338ecea32a0c35e	2020-04-21 22:33:49 -07:00
David Reiss	3c85f44ce8	Fail setup.py if trying to set up with Python 2 (#35613 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35613 Python 2 has reached end-of-life and is no longer supported by PyTorch. To spare users from a long, doomed setup when trying to use PyTorch with Python 2, detect this case early and fail with a clear message. This commit covers setup.py. Test Plan: Attempted to build PyTorch with Python 2 and saw a clear error quickly. Differential Revision: D20842881 Pulled By: dreiss fbshipit-source-id: caaaa0dbff83145ff668bd25df6d7d4b3ce12e47	2020-04-16 10:24:03 -07:00
peter	b9260bdb7b	Don't build deps for `python setup.py egg_info` (#36208 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/36207. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36208 Differential Revision: D20919649 Pulled By: ezyang fbshipit-source-id: b5242a540181b29dba8987fb5f00332e1e81ca98	2020-04-08 09:02:01 -07:00
Sebastian Messmer	7ee88d61f7	Rename boxing/unboxing files and utilities (#35411 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35411 The file and class names in ATen/core/boxing were quite confusing. Let's rename them for readability. Also move function schema inference out of the boxing logic into op_registration.h where it belongs. ghstack-source-id: 101539206 Test Plan: waitforsandcastle Differential Revision: D20653621 fbshipit-source-id: 6a79c73d5758bee1e072d543c030913b18a69c7c	2020-04-04 14:13:28 -07:00
Feng Tian	762270c51f	add c10d dynamic loading mechanism and unit test (#28068 ) Summary: The original behavior of pytorch c10d only supports built-in c10d backends, such as nccl/gloo/mpi. This patch extends the c10d capability to support dynamically loading 3rd party communication libraries which are derived from the ProcessGroup base class. The related RFC is in: https://github.com/pytorch/pytorch/issues/27955 This way, users just need to specify a 3rd party c10d backend name when invoking torch.distributed.init_process_group(). The proposed logic will try to load the corresponding c10d backend cpp extension automatically. As for how to develop a new 3rd party c10d backend through cpp extension, please refer to test/cpp_extensions/cpp_c10d_extension.cpp Pull Request resolved: https://github.com/pytorch/pytorch/pull/28068 Differential Revision: D19174838 Pulled By: agolynski fbshipit-source-id: 3409a504a43ce7260e6f9d1207c00e87471fac62	2020-04-02 15:46:51 -07:00
Orion Reblitz-Richardson	f101949390	Remove python2 support from setup.py (#35539 ) Summary: As a followup to https://github.com/pytorch/pytorch/pull/35042 this removes python2 from setup.py and adds Python 3.8 to the list of supported versions. We're already testing this in CircleCI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35539 Differential Revision: D20709060 Pulled By: orionr fbshipit-source-id: 5d40bc14cb885374fec370fc7c5d3cde8769039a	2020-03-27 14:33:11 -07:00
pinzhenx	bd604cb5b7	Upgrade MKL-DNN to DNNL v1.2 (#32422 ) Summary:
## Motivation
This PR upgrades MKL-DNN from v0.20 to DNNL v1.2 and resolves https://github.com/pytorch/pytorch/issues/30300. DNNL (Deep Neural Network Library) is the new brand of MKL-DNN, which improves performance, quality, and usability over the old version. This PR focuses on the migration of all existing functionalities, including minor fixes, performance improvement and code clean up. It serves as the cornerstone of our future efforts to accommodate new features like OpenCL support, BF16 training, INT8 inference, etc. and to let the Pytorch community derive more benefits from the Intel Architecture.
## What's included?
Even though DNNL has many breaking changes to the API, we managed to absorb most of them in ideep. This PR contains minimalist changes to the integration code in pytorch. Below is a summary of the changes:
General:
1. Replace op-level allocator with global-registered allocator
```
// before
ideep::sum::compute<AllocForMKLDNN>(scales, {x, y}, z);
// after
ideep::sum::compute(scales, {x, y}, z);
```
The allocator is now being registered at `aten/src/ATen/native/mkldnn/IDeepRegistration.cpp`. Thereafter all tensors derived from the `cpu_engine` (by default) will use the c10 allocator.
```
RegisterEngineAllocator cpu_alloc(
  ideep::engine::cpu_engine(),
  [](size_t size) {
    return c10::GetAllocator(c10::DeviceType::CPU)->raw_allocate(size);
  },
  [](void* p) {
    c10::GetAllocator(c10::DeviceType::CPU)->raw_deallocate(p);
  }
);
```
------
2. Simplify group convolution
We had a scenario in convolution where the ideep tensor shape mismatched the aten tensor: when `groups > 1`, DNNL expects weights tensors to be 5-d with an extra group dimension, e.g. `goihw` instead of `oihw` in the 2d conv case. As shown below, a lot of extra checks came with this difference in shape before. Now we've completely hidden this difference in ideep and all tensors are going to align with pytorch's definition. So we could safely remove these checks from both aten and c2 integration code.
```
// aten/src/ATen/native/mkldnn/Conv.cpp
if (w.ndims() == x.ndims() + 1) {
  AT_ASSERTM(
      groups > 1,
      "Only group _mkldnn_conv2d weights could have been reordered to 5d");
  kernel_size[0] = w.get_dim(0) * w.get_dim(1);
  std::copy_n(
      w.get_dims().cbegin() + 2, x.ndims() - 1, kernel_size.begin() + 1);
} else {
  std::copy_n(w.get_dims().cbegin(), x.ndims(), kernel_size.begin());
}
```
------
3. Enable DNNL built-in cache
Previously, we stored DNNL jitted kernels along with intermediate buffers inside ideep using an LRU cache. Now we are switching to the newly added DNNL built-in cache, and no longer caching buffers in order to reduce memory footprint. This change will be mainly reflected in lower memory usage from memory profiling results. On the code side, we removed a couple of lines of `op_key_` that depended on the ideep cache before.
------
4. Use 64-bit integer to denote dimensions
We changed the type of `ideep::dims` from `vector<int32_t>` to `vector<int64_t>`. This renders ideep dims no longer compatible with 32-bit dims used by caffe2. So we use something like `{stride_.begin(), stride_.end()}` to cast parameter `stride_` into an int64 vector.
Misc changes in each commit:
Commit: change build options
Some build options were slightly changed, mainly to avoid name collisions with other projects that include DNNL as a subproject. In addition, DNNL built-in cache is enabled by option `DNNL_ENABLE_PRIMITIVE_CACHE`.
Old \| New -- \| -- WITH_EXAMPLE \| MKLDNN_BUILD_EXAMPLES WITH_TEST \| MKLDNN_BUILD_TESTS MKLDNN_THREADING \| MKLDNN_CPU_RUNTIME MKLDNN_USE_MKL \| N/A (MKL is no longer used) ------ Commit: aten reintegration - aten/src/ATen/native/mkldnn/BinaryOps.cpp Implement binary ops using new operation `binary` provided by DNNL - aten/src/ATen/native/mkldnn/Conv.cpp Clean up group convolution checks Simplify conv backward integration - aten/src/ATen/native/mkldnn/MKLDNNConversions.cpp Simplify prepacking convolution weights - test/test_mkldnn.py Fixed an issue in the conv2d unit test: previously it did not compare conv results between the mkldnn and aten implementations. Instead, it compared mkldnn with mkldnn, because the default CPU path also goes into mkldnn. Now we use `torch.backends.mkldnn.flags` to fix this issue - torch/utils/mkldnn.py Prepack weight tensor on module `__init__` to achieve significantly better performance ------ Commit: caffe2 reintegration - caffe2/ideep/ideep_utils.h Clean up unused type definitions - caffe2/ideep/operators/adam_op.cc & caffe2/ideep/operators/momentum_sgd_op.cc Unify tensor initialization with `ideep::tensor::init`. Obsolete `ideep::tensor::reinit` - caffe2/ideep/operators/conv_op.cc & caffe2/ideep/operators/quantization/int8_conv_op.cc Clean up group convolution checks Revamp convolution API - caffe2/ideep/operators/conv_transpose_op.cc Clean up group convolution checks Clean up deconv workaround code ------ Commit: custom allocator - Register c10 allocator as mentioned above ## Performance We tested inference on some common models based on user scenarios, and most performance numbers are either better than or on par with DNNL 0.20.
ratio: new / old \| Latency (batch=1 4T) \| Throughput (batch=64 56T) -- \| -- \| -- pytorch resnet18 \| 121.4% \| 99.7% pytorch resnet50 \| 123.1% \| 106.9% pytorch resnext101_32x8d \| 116.3% \| 100.1% pytorch resnext50_32x4d \| 141.9% \| 104.4% pytorch mobilenet_v2 \| 163.0% \| 105.8% caffe2 alexnet \| 303.0% \| 99.2% caffe2 googlenet-v3 \| 101.1% \| 99.2% caffe2 inception-v1 \| 102.2% \| 101.7% caffe2 mobilenet-v1 \| 356.1% \| 253.7% caffe2 resnet101 \| 100.4% \| 99.8% caffe2 resnet152 \| 99.8% \| 99.8% caffe2 shufflenet \| 141.1% \| 69.0% † caffe2 squeezenet \| 98.5% \| 99.2% caffe2 vgg16 \| 136.8% \| 100.6% caffe2 googlenet-v3 int8 \| 100.0% \| 100.7% caffe2 mobilenet-v1 int8 \| 779.2% \| 943.0% caffe2 resnet50 int8 \| 99.5% \| 95.5% _Configuration: Platform: Skylake 8180 Latency Test: 4 threads, warmup 30, iteration 500, batch size 1 Throughput Test: 56 threads, warmup 30, iteration 200, batch size 64_ † Shufflenet is one of the few models that require temp buffers during inference. The performance degradation is expected, since we no longer cache any buffers in ideep. As a solution, we suggest that users opt for a caching allocator such as jemalloc as a drop-in replacement for the system allocator in such heavy workloads. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32422 Test Plan: Perf results: https://our.intern.facebook.com/intern/fblearner/details/177790608?tab=Experiment%20Results 10% improvement for ResNext with avx512, neutral on avx2 More results: https://fb.quip.com/ob10AL0bCDXW#NNNACAUoHJP Reviewed By: yinghai Differential Revision: D20381325 Pulled By: dzhulgakov fbshipit-source-id: 803b906fd89ed8b723c5fcab55039efe3e4bcb77	2020-03-26 22:07:59 -07:00
Pavel Belevich	11a40410e7	pybind11 type_caster for at::Generator and custom RNG python test (#34774 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34774 This PR provides pybind11's `type_caster<at::Generator>` that allows mapping an `at::Generator` instance returned from a user-defined method to the Python `torch::Generator`, defined as the `THPGenerator` C++ class. This allows 1) defining a custom RNG in a C++ extension and 2) using the custom RNG in Python code. `TestRNGExtension.test_rng` shows how to use a custom RNG defined in `rng_extension.cpp`. Test Plan: Imported from OSS Differential Revision: D20549451 Pulled By: pbelevich fbshipit-source-id: 312a6deccf8228f7f60695bbf95834620d52f5eb	2020-03-22 10:57:35 -07:00
Nikita Shulga	d3f5045bf5	PyTorch should always depend on `future` (#35057 ) Summary: Because `past` is used in `caffe2.python.core` Pull Request resolved: https://github.com/pytorch/pytorch/pull/35057 Test Plan: CI Differential Revision: D20547042 Pulled By: malfet fbshipit-source-id: cad2123c7b88271fea37f21e616df551075383a8	2020-03-19 17:31:47 -07:00
Eli Uriegas	275f5c8049	setup.py: Add numpy as required for install_requires (#34510 ) Summary: Was originally not a requirement but we should add it back here since it's required on import and we require it anyways for our conda packages. Tested with: ``` ❯ pkginfo -f requires_dist *.whl requires_dist: ['numpy'] ``` Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/34510 Differential Revision: D20352125 Pulled By: seemethere fbshipit-source-id: 383e396fe500ed7043d83c3df57d1772d0fff1e6	2020-03-17 13:31:55 -07:00
Nikita Shulga	6d790c3611	Mark PyTorch incompatible with python-3.6.0 (#34724 ) Summary: Per https://github.com/pytorch/pytorch/issues/19161 PyTorch is incompatible with 3.6.0 due to the missing `PySlice_Unpack` Pull Request resolved: https://github.com/pytorch/pytorch/pull/34724 Test Plan: CI + try to load pytorch binary using python-3.6.0 Differential Revision: D20449052 Pulled By: malfet fbshipit-source-id: 2c787fc64f5d1377c7f935ad2f3c77f46723d7dd	2020-03-13 15:22:34 -07:00
Nikita Shulga	dd7cec680c	Do not use clang if it can not parse system extensions (#34549 ) Summary: Attempting to build PyTorch with ASAN on a system with gcc-8 fails due to mismatched system compilation flags. Address the issue by using the original compiler to build the `torch._C` extension. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34549 Test Plan: Run `.jenkins/pytorch/build-asan.sh` on FC-30 Differential Revision: D20373781 Pulled By: malfet fbshipit-source-id: 041c8d25f96b4436385a5e0eb6fc46e9b5fdf3f1	2020-03-10 15:40:08 -07:00
xiaobing.zhang	b678256bfb	Move glu to Aten(CPU) (#33179 ) Summary: This PR moves glu to Aten(CPU). Test script: ``` import torch import torch.nn.functional as F import time torch.manual_seed(0) def _time(): if torch.cuda.is_available(): torch.cuda.synchronize() return time.time() device = "cpu" #warm up for n in [10, 100, 1000, 10000]: input = torch.randn(128, n, requires_grad=True, device=device) grad_output = torch.ones(128, n // 2, device=device) for i in range(1000): output = F.glu(input) output.backward(grad_output) for n in [10, 100, 1000, 10000]: fwd_t = 0 bwd_t = 0 input = torch.randn(128, n, requires_grad=True, device=device) grad_output = torch.ones(128, n // 2, device=device) for i in range(10000): t1 = _time() output = F.glu(input) t2 = _time() output.backward(grad_output) t3 = _time() fwd_t = fwd_t + (t2 -t1) bwd_t = bwd_t + (t3 - t2) fwd_avg = fwd_t / 10000 * 1000 bwd_avg = bwd_t / 10000 * 1000 print("input size(128, %d) forward time is %.2f (ms); backwad avg time is %.2f (ms)." % (n, fwd_avg, bwd_avg)) ``` Test device: skx-8180. Before: ``` input size(128, 10) forward time is 0.04 (ms); backwad avg time is 0.08 (ms). input size(128, 100) forward time is 0.06 (ms); backwad avg time is 0.14 (ms). input size(128, 1000) forward time is 0.11 (ms); backwad avg time is 0.31 (ms). input size(128, 10000) forward time is 1.52 (ms); backwad avg time is 2.04 (ms). ``` After: ``` input size(128, 10) forward time is 0.02 (ms); backwad avg time is 0.05 (ms). input size(128, 100) forward time is 0.04 (ms); backwad avg time is 0.09 (ms). input size(128, 1000) forward time is 0.07 (ms); backwad avg time is 0.17 (ms). input size(128, 10000) forward time is 0.13 (ms); backwad avg time is 1.03 (ms). ``` Fixes https://github.com/pytorch/pytorch/issues/24707, https://github.com/pytorch/pytorch/issues/24708.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33179 Differential Revision: D19839835 Pulled By: VitalyFedyunin fbshipit-source-id: e4d3438556a1068da2c4a7e573d6bbf8d2a6e2b9	2020-02-28 14:54:38 -08:00
Michael Suo	dbe850af5b	[jit] do the code reorg (#33851 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33851 Rationale and context described in #33828. Script to reproduce the move: https://gist.github.com/suo/16cbefaaeb67ca5a7c6caffd49b7f6e9 ghstack-source-id: 99079645 Test Plan: Make sure CI passes Reviewed By: jamesr66a Differential Revision: D20133869 fbshipit-source-id: 390e9241a9c85366d9005c492ac31f10aa96488e	2020-02-27 13:02:51 -08:00
Pavel Belevich	b1c85dd916	Custom RNG DispatchKey (#32325 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32325 The purpose of this PR is to enable PyTorch dispatching on `at::Generator` parameters and demonstrate how it can be used in cpp extensions to implement custom RNG. 1. `CustomRNGKeyId` value added to DispatchKey enum and `DispatchKeySet key_set_` added to `at::Generator` 2. The overloaded `operator()(at::Generator gen)` added to MultiDispatchKeySet. 3. The existing CPUGenerator and CUDAGenerator classes are supplied with CPUTensorId and CUDATensorId dispatch keys 4. The implementation of CPU's `cauchy_kernel` (as an example, because it's already moved to ATen) was templatized and moved to `ATen/native/cpu/DistributionTemplates.h` to make it available for cpp extensions 5. Minor CMake changes to make native/cpu tensors available for cpp extensions 6. RegisterCustomRNG test that demonstrates how a CustomCPUGenerator class can be implemented and how the custom_rng_cauchy_ native function can be registered to handle Tensor::cauchy_ calls. Test Plan: Imported from OSS Differential Revision: D19604558 Pulled By: pbelevich fbshipit-source-id: 2619f14076cee5742094a0be832d8530bba72728	2020-01-29 11:30:04 -08:00
Pritam Damania	f050b16dd9	Move pytorch distributed tests to separate folder for contbuild. (#30445 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445 Create distributed and rpc directories under caffe/test for better management of unit tests. Differential Revision: D18702786 fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606	2020-01-22 21:16:59 -08:00
ashish	9a4219eb39	Install complete set of headers for ROCm build (#32076 ) Summary: This PR adds a more complete list of PyTorch header files to be installed at build time. It also fixes one instance of including a header from the local src directory instead of the installed directory. A more complete set of headers enables other modules to work correctly with PyTorch built for ROCm. cc: ezyang bddppq iotamudelta Pull Request resolved: https://github.com/pytorch/pytorch/pull/32076 Differential Revision: D19372933 Pulled By: ezyang fbshipit-source-id: 3b5f3241c001fa05ea448c359a706ce9a8214aa0	2020-01-13 08:33:28 -08:00
Edward Yang	4ef9daf7b2	Remove dead CAFFE2_LIBS variable (#31155 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31155 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D19262584 Pulled By: ezyang fbshipit-source-id: 147ac5a9c36e813ea9a2f68b498880942d661be5	2020-01-06 14:39:47 -08:00
zrphercule	c564d794ed	Add ATen/native/ headers to torch target (#30835 ) Summary: We didn't have ATen/native/*.h in the torch target before, and we would like it to be exposed for external use. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30835 Differential Revision: D18836160 Pulled By: zrphercule fbshipit-source-id: 7330a9c9d8b65f173cc332b1cfeeb18c7dca20a8	2019-12-05 13:24:21 -08:00
Sebastian Messmer	bc2e6d10fa	Back out "Revert D17908478: Switch PyTorch/Caffe2 to C++14" Summary: Original commit changeset: 775d2e29be0b Test Plan: CI Reviewed By: mruberry Differential Revision: D18775520 fbshipit-source-id: a350b3f86b66d97241f208786ee67e9a51172eac	2019-12-03 14:33:43 -08:00
Sebastian Messmer	a2ed50c920	Revert D17908478: Switch PyTorch/Caffe2 to C++14 Test Plan: revert-hammer Differential Revision: D17908478 Original commit changeset: 6e340024591e fbshipit-source-id: 775d2e29be0bc3a0db64f164c8960c44d4877d5d	2019-11-27 14:57:05 -08:00
Sebastian Messmer	d0acc9c085	Switch PyTorch/Caffe2 to C++14 (#30406 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30406 ghstack-source-id: 94642238 Test Plan: waitforsandcastle Differential Revision: D17908478 fbshipit-source-id: 6e340024591ec2c69521668022999df4a33b4ddb	2019-11-27 10:47:31 -08:00
Thomas Viehmann	7889e1e3f9	Add `torch.version.hip` from cmake (#29815 ) Summary: This adds the HIP_VERSION cmake variable as hip_version. This should help detecting ROCm, e.g. in https://github.com/pytorch/pytorch/issues/22091. To parallel CUDA, hip_version is a string. An alternative variant might be to split by '.' and only take the first two parts. The method suffers a bit from ROCm not being as monolithic as CUDA. Pull Request resolved: https://github.com/pytorch/pytorch/pull/29815 Differential Revision: D18532267 Pulled By: bddppq fbshipit-source-id: 1bde4ad0cfacc47bfd1c0945e130921d8575a5bf	2019-11-15 12:03:15 -08:00
Junjie Bai	b0c245d52d	Consolidate the places that find pybind11 include dirs (#29659 ) Summary: Also move the logic that installs the pybind11 headers from setup.py to cmake (to align with other headers). Pull Request resolved: https://github.com/pytorch/pytorch/pull/29659 Differential Revision: D18458208 Pulled By: bddppq fbshipit-source-id: cfd1e74b892d4a65591626ab321780c8c87b810d	2019-11-12 14:51:56 -08:00
zrphercule	eae4a69069	Add quantized fbgemm headers to torch target (#29418 ) Summary: We didn't have ATen/native/quantized/cpu/*.h in the torch target before, and we would like it to be exposed for external use. Pull Request resolved: https://github.com/pytorch/pytorch/pull/29418 Differential Revision: D18383534 Pulled By: zrphercule fbshipit-source-id: 72c06ae2c10e8cc49e7256c9e9b89288263bbfde	2019-11-08 14:32:19 -08:00
peter	d05da7dad3	Fix virtualenv builds on Windows (#29273 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/29058. Pull Request resolved: https://github.com/pytorch/pytorch/pull/29273 Differential Revision: D18349822 Pulled By: ezyang fbshipit-source-id: c4d76521cc0742d890f22f1d7f32dede5600b651	2019-11-06 09:02:30 -08:00
qzhong0605	50fd20b64a	fix bug on setup.py to include header files on caffe2/utils/math (#28869 ) Summary: This problem comes from issue [https://github.com/pytorch/pytorch/issues/28753](https://github.com/pytorch/pytorch/issues/28753) The header files in the directories `math` and `threadpool` should be included in the built package because they are included by other header files, such as `torch/include/caffe2/utils/math.h` ``` #include "caffe2/core/common.h" #include "caffe2/core/types.h" #include "caffe2/utils/math/broadcast.h" #include "caffe2/utils/math/elementwise.h" #include "caffe2/utils/math/reduce.h" #include "caffe2/utils/math/transpose.h" #include "caffe2/utils/math/utils.h" ``` But `setup.py` on the `master` branch doesn't include these header files. The header files in the `utils` directory of a built `torch` package are the following: ``` > ls include/caffe2/utils bench_utils.h conversions.h eigen_utils.h map_utils.h murmur_hash3.h proto_wrap.h smart_tensor_printer.h cast.h cpuid.h filler.h math-detail.h proto_convert.h signal_handler.h string_utils.h cblas.h cpu_neon.h fixed_divisor.h math.h proto_utils.h simple_queue.h zmq_helper.h ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/28869 Differential Revision: D18226319 Pulled By: soumith fbshipit-source-id: 51575ddc559181c069b3324aa9b2d1669310ba25	2019-10-30 11:11:15 -07:00
Wanchao Liang	4beaf1cf1c	add typing runtime dependency for py2 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28442 Test Plan: Imported from OSS Differential Revision: D18075498 fbshipit-source-id: 075f63b1ed2c83d9a64eb81224e0d67c6a63b22c	2019-10-22 22:02:08 -07:00
Hong Xu	a5354adb08	Eliminate the use of CUDA_HOME in setup.py. (#28373 ) Summary: Variables read from CMakeCache.txt are more reliable. Close https://github.com/pytorch/pytorch/issues/28365 Pull Request resolved: https://github.com/pytorch/pytorch/pull/28373 Differential Revision: D18061855 Pulled By: ezyang fbshipit-source-id: c550a365e23464411d75eca167f7e6e053f94872	2019-10-22 14:04:54 -07:00
Rohan Varma	badb08d577	Add clip_grad_norm_ to c++ api (#26140 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26140 Per https://github.com/pytorch/pytorch/issues/25883, we want to work towards C++/Python API parity. This diff adds clip_grad_norm_ to the C++ API to improve parity. ghstack-source-id: 91334333 Test Plan: Added a unit test Differential Revision: D17312367 fbshipit-source-id: 753ba3a4d084d01f3cc8919da3108e67c809ad65	2019-10-04 13:50:36 -07:00
Hong Xu	081069e8ca	Remove CUDA_VERSION from Python script (which has already been detected in CMake) (#27316 ) Summary: (Intentionally left blank) Pull Request resolved: https://github.com/pytorch/pytorch/pull/27316 Differential Revision: D17762715 Pulled By: ezyang fbshipit-source-id: 044c0ea6e8c2d12912c946a9a50b934b5253d8c8	2019-10-04 07:49:57 -07:00
Pavel Belevich	493c900810	Extract version to version.txt (#27149 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27149 Extract version to version.txt and add reading version logic to setup.py and fb/torch_version.py ghstack-source-id: 91271883 Test Plan: N/A Reviewed By: gchanan, ezyang Differential Revision: D17689307 fbshipit-source-id: 21899502027cec71b63d9dc151e09ff5ff3f279d	2019-10-03 12:13:15 -07:00
Hong Xu	5e5cbceeba	remove tools/setup_helpers/cudnn.py (#25876 ) Summary: FindCUDNN.cmake and cuda.cmake have done the detection. This commit deletes `tools/setup_helpers/cudnn.py` as it is no longer needed. Previously in https://github.com/pytorch/pytorch/issues/25482, one test failed because TensorRT detects cuDNN differently, and there may be situations we can find cuDNN but TensorRT cannot. This is fixed by passing our detection result down to TensorRT. Pull Request resolved: https://github.com/pytorch/pytorch/pull/25876 Differential Revision: D17346270 Pulled By: ezyang fbshipit-source-id: c1e7ad4a1cb20f964fe07a72906f2f002425d894	2019-09-24 07:44:33 -07:00
Sebastian Messmer	ed207b53ab	c10::KernelFunction (#26337 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26337 - Factor out boxing and unboxing functionality from the c10 dispatcher into a c10::KernelFunction class - Move that class and everything else it depends on into ATen/core/boxing - This also allows us to get rid of c10::KernelCache. Instead, we now store a pointer to the unboxed functor in c10::KernelFunction. - We're also getting rid of the DispatchTableEntry struct and instead store KernelFunction directly. - To make this work, we need to change the dispatcher calling API from Dispatcher::lookup().callBoxed/callUnboxed and OperatorEntry::lookup().callBoxed/callUnboxed to Dispatcher::callBoxed/callUnboxed and OperatorEntry::callBoxed/callUnboxed. ghstack-source-id: 90459911 Test Plan: unit tests Differential Revision: D17416607 fbshipit-source-id: fd221f1d70eb3f1b4d33092eaa7e37d25684c934	2019-09-20 18:55:25 -07:00
Will Feng	57a4b7c55d	Re-organize C++ API `torch::nn` folder structure (#26262 ) Summary: This PR aims to re-organize C++ API `torch::nn` folder structure in the following way: - Every module in `torch/csrc/api/include/torch/nn/modules/` (except `any.h`, `named_any.h`, `modulelist.h`, `sequential.h`, `embedding.h`) has a strictly equivalent Python file in `torch/nn/modules/`. For example: `torch/csrc/api/include/torch/nn/modules/pooling.h` -> `torch/nn/modules/pooling.py` `torch/csrc/api/include/torch/nn/modules/conv.h` -> `torch/nn/modules/conv.py` `torch/csrc/api/include/torch/nn/modules/batchnorm.h` -> `torch/nn/modules/batchnorm.py` `torch/csrc/api/include/torch/nn/modules/sparse.h` -> `torch/nn/modules/sparse.py` - Containers such as `any.h`, `named_any.h`, `modulelist.h`, `sequential.h` are moved into `torch/csrc/api/include/torch/nn/modules/container/`, because their implementations are too long to be combined into one file (like `torch/nn/modules/container.py` in Python API) - `embedding.h` is not renamed to `sparse.h` yet, because we have another work stream that works on API parity for Embedding and EmbeddingBag, and renaming the file would cause conflict. After the embedding API parity work is done, we will rename `embedding.h` to `sparse.h` to match the Python file name, and move the embedding options out to options/ folder. - `torch/csrc/api/include/torch/nn/functional/` is added, and the folder structure mirrors that of `torch/csrc/api/include/torch/nn/modules/`. For example, `torch/csrc/api/include/torch/nn/functional/pooling.h` contains the functions for pooling, which are then used by the pooling modules in `torch/csrc/api/include/torch/nn/modules/pooling.h`. - `torch/csrc/api/include/torch/nn/options/` is added, and the folder structure mirrors that of `torch/csrc/api/include/torch/nn/modules/`. 
For example, `torch/csrc/api/include/torch/nn/options/pooling.h` contains MaxPoolOptions, which is used by both MaxPool modules in `torch/csrc/api/include/torch/nn/modules/pooling.h`, and max_pool functions in `torch/csrc/api/include/torch/nn/functional/pooling.h`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/26262 Differential Revision: D17422426 Pulled By: yf225 fbshipit-source-id: c413d2a374ba716dac81db31516619bbd879db7f	2019-09-17 10:07:29 -07:00
Ailing Zhang	079cd4e1fc	Remove requests as dependency (#26083 ) Summary: local build is slow... test in CI... Pull Request resolved: https://github.com/pytorch/pytorch/pull/26083 Differential Revision: D17346949 Pulled By: ailzhang fbshipit-source-id: f552d1a4be55ad4e2bd915af7c5a2c1b6667c446	2019-09-13 08:39:53 -07:00
Hong Xu	8a026d4f74	Remove tools/setup_helpers/dist_check.py (#25879 ) Summary: What dist_check.py does is largely just determining whether we should set "USE_IBVERBS" to ON or OFF when the user sets "USE_GLOO_IBVERBS" to ON. But this is unnecessary, because this complicated determination will always be overridden by gloo: `2101e02cea/cmake/Dependencies.cmake (L19-L28)` Since dist_check.py becomes irrelevant, this commit also simplifies the setting of `USE_DISTRIBUTED` (by removing its explicit setting in Python scripts), and deprecates `USE_GLOO_IBVERBS` in favor of `USE_IBVERBS`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/25879 Differential Revision: D17282395 Pulled By: pietern fbshipit-source-id: a10735f50728d89c3d81fd57bcd26764e7f84dd1	2019-09-10 04:33:28 -07:00
Edward Yang	97b432bdf0	Back out "[pytorch][PR] remove tools/setup_helpers/cudnn.py" Summary: Original commit changeset: abd9cd0244ca (Note: this ignores all push blocking failures!) Test Plan: none Reviewed By: nairbv Differential Revision: D17259003 fbshipit-source-id: d7e067eeb36192766c639bfcbc66f540ce8eb77e	2019-09-09 06:47:45 -07:00
Hong Xu	66ac6698f6	remove tools/setup_helpers/cudnn.py (#25482 ) Summary: FindCUDNN.cmake and cuda.cmake have done the detection. This commit deletes `tools/setup_helpers/cudnn.py` as it is no longer needed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/25482 Differential Revision: D17226408 Pulled By: ezyang fbshipit-source-id: abd9cd0244cabea1f5d9f93f828d632d77c8dd5e	2019-09-06 06:54:35 -07:00
Pieter Noordhuis	3556bea5aa	Build torch.distributed with Gloo backend on macOS (#25260 ) Summary: In facebookincubator/gloo#212, a libuv based Gloo transport was introduced, which allows us to use Gloo on macOS (and later perhaps also Windows). This commit updates CMake code to enable building with USE_DISTRIBUTED=1 on macOS. A few notes: * The Caffe2 ops are not compiled, for they depend on `gloo::transport::tcp`. * The process group implementation uses `gloo::transport::tcp` on Linux (because it relies on `epoll(2)`) and `gloo::transport::uv` on macOS. * The TCP store works but sometimes crashes on process termination. * The distributed tests are not yet run. * The nightly builds don't use `USE_DISTRIBUTED=1`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/25260 Reviewed By: mrshenli Differential Revision: D17202381 Pulled By: pietern fbshipit-source-id: ca80a82e78a05b4154271d2fb0ed31c8d9f26a7c	2019-09-05 07:09:50 -07:00
James Reed	f71ddd4292	Switch hub to use `requests` because of SSL (#25083 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25083 I missed this in the last PR Test Plan: Imported from OSS Differential Revision: D17005372 Pulled By: jamesr66a fbshipit-source-id: 1200a6cd88fb9051aed8baf3162a9f8ffbf65189	2019-08-24 12:06:49 -07:00
Hong Xu	1a9334ea59	Hotpatch CXXFLAGS to be the same as CFLAGS if CXXFLAGS is not set. (#23568 ) Summary: This fixes build regression caused by https://github.com/pytorch/pytorch/issues/23528 because we used to let CXXFLAGS equal CFLAGS. cc suo Pull Request resolved: https://github.com/pytorch/pytorch/pull/23568 Differential Revision: D16568820 Pulled By: suo fbshipit-source-id: 64a0dc923c08ac1751224f42bc4ccdc707341762	2019-08-07 16:25:57 -07:00
Hugo	0f5d071d52	Add python_requires to help pip (#23863 ) Summary: `python_requires` helps the installer choose the correct version of this package for the user's running Python. This is especially necessary when dropping Python 2 (https://github.com/pytorch/pytorch/issues/23795) but is useful now too. Pull Request resolved: https://github.com/pytorch/pytorch/pull/23863 Differential Revision: D16692908 Pulled By: soumith fbshipit-source-id: 3c9ba2eb1d1cf12763d6284daa4f18f605abb373	2019-08-07 12:47:53 -07:00
Edward Yang	a1d945b295	Roll master to 1.3.0 (#23895 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23895 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D16688489 Pulled By: ezyang fbshipit-source-id: a56d0180a0bc57775badd9e31ea3d441d5fd4f88	2019-08-07 08:44:32 -07:00
Soumith Chintala	6313d5e28b	add appropriate install_requires (#23722 ) Summary: This adds: - dependency on numpy if compiled with numpy support - dependency on future if python <= 2.7 Fixes https://github.com/pytorch/pytorch/issues/23670 Pull Request resolved: https://github.com/pytorch/pytorch/pull/23722 Differential Revision: D16643824 Pulled By: soumith fbshipit-source-id: 5cf4d79cd188678cb2328c4286eabd52a2a86fcd	2019-08-04 17:24:19 -07:00
Soumith Chintala	dded794eeb	add setup metadata to help PyPI flesh out content on pypi package page (#22085 ) Summary: add setup metadata to help PyPI flesh out content on pypi package page. Apparently this might help flesh out the "Used By" feature according to driazati Pull Request resolved: https://github.com/pytorch/pytorch/pull/22085 Differential Revision: D16604703 Pulled By: soumith fbshipit-source-id: ddb4f7ba7c24fdf718260aed28cc7bc9afb46de9	2019-08-01 12:15:56 -07:00
Ilia Cherniavskii	74f8094ea5	Rename threading build options Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23407 Test Plan: USE_CUDA=0 ATEN_THREADING=TBB USE_OPENMP=0 USE_TBB=1 MKL_THREADING=TBB BLAS=MKL USE_MKLDNN=1 MKLDNN_THREADING=TBB BUILD_BINARY=1 python setup.py develop install --cmake ./build/bin/parallel_info Imported from OSS Differential Revision: D16522538 Pulled By: ilia-cher fbshipit-source-id: 75c4761d93a7f5936f28e4c5eedcd27d8490d0c5	2019-07-26 13:09:14 -07:00
Hong Xu	82545ecc71	Specify build dir as a global variable in BUILD_DIR in the build system. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23318 Test Plan: Imported from OSS Differential Revision: D16493987 Pulled By: ezyang fbshipit-source-id: 497e9dd924280f61dde095b4f2b50f5402d9da97	2019-07-25 07:19:47 -07:00
Hong Xu	fd1d06e317	Let Python build scripts accept both CMAKE_BUILD_TYPE and the oldschool DEBUG and REL_WITH_DEB_INFO variables. (#22875 ) Summary: Currently the build type is decided by the environment variables DEBUG and REL_WITH_DEB_INFO. This commit also lets CMAKE_BUILD_TYPE take effect, which makes the interface more consistent with CMake. This also prepares for https://github.com/pytorch/pytorch/issues/22776. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22875 Differential Revision: D16281663 Pulled By: ezyang fbshipit-source-id: 952f92aad85ff59f1c7abe8256eca8a4a0936026	2019-07-24 08:07:47 -07:00
Hong Xu	60c46dd4df	Let CMake handle NCCL detection instead of our handcrafted Python script. (#22930 ) Summary: --- How does the current code subsume all detections in the deleted `nccl.py`? - The dependency of `USE_NCCL` on the OS and `USE_CUDA` is handled as dependency options in `CMakeLists.txt`. - The main NCCL detection happens in [FindNCCL.cmake](`8377d4b32c/cmake/Modules/FindNCCL.cmake`), which is called by [nccl.cmake](`8377d4b32c/cmake/External/nccl.cmake`). When `USE_SYSTEM_NCCL` is false, the previous Python code deferred the detection to `find_package(NCCL)`. The change in `nccl.cmake` retains this. - `USE_STATIC_NCCL` in the previous Python code simply changes the name of the detected library. This is done in `IF (USE_STATIC_NCCL)`. - Now we only need to look at how the lines below line 20 in `nccl.cmake` are subsumed. These lines list paths to header and library directories that NCCL headers and libraries may reside in and try to search these directories for the key header and library files in turn. These are done by `find_path` for headers and `find_library` for the library files in `FindNCCL.cmake`. * The call of [find_path](https://cmake.org/cmake/help/v3.8/command/find_path.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for headers in `<prefix>/include` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. Like the Python code, this commit sets `CMAKE_PREFIX_PATH` to search for `<prefix>` in `NCCL_ROOT_DIR` and the CUDA home directory. `CMAKE_SYSTEM_PREFIX_PATH` includes the standard directories such as `/usr/local` and `/usr`. `NCCL_INCLUDE_DIR` is also specifically handled. * Similarly, the call of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for libraries in directories including `<prefix>/lib` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`.
But it also handles the edge cases intended to be solved in the Python code more properly: - It only searches for `<prefix>/lib64` (and `<prefix>/lib32`) if it is appropriate on the system. - It only searches for `<prefix>/lib/<arch>` for the right `<arch>`, unlike the Python code, which searches for `lib/<arch>` in a generic way (e.g., the Python code searches for `/usr/lib/x86_64-linux-gnu` but in reality systems have `/usr/lib/x86_64-some-customized-name-linux-gnu`, see https://unix.stackexchange.com/a/226180/38242 ). --- Regarding relevant issues: - https://github.com/pytorch/pytorch/issues/12063 and https://github.com/pytorch/pytorch/issues/2877: These are properly handled, as explained in the updated comment. - https://github.com/pytorch/pytorch/issues/2941 does not change NCCL detection specifically for Windows (it changed CUDA detection). - `b7e258f81e` A versioned library detection is added, but the order is reversed: The unversioned library becomes preferred. This is because normally unversioned libraries are linked to versioned libraries and preferred by users, and local installations by users are often unversioned. As the documentation of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) suggests: > When using this to specify names with and without a version suffix, we recommend specifying the unversioned name first so that locally-built packages can be found before those provided by distributions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22930 Differential Revision: D16440275 Pulled By: ezyang fbshipit-source-id: 11fe80743d4fe89b1ed6f96d5d996496e8ec01aa	2019-07-23 08:45:51 -07:00
Edward Yang	798d5d9771	Revert D16281714: Add sanity checks for NCCL detection. Differential Revision: D16281714 Original commit changeset: 396bcbf099bd fbshipit-source-id: a22cc112d1b6a62d689f9d8a7f93e8be3abe2a44	2019-07-16 13:58:27 -07:00
Hong Xu	e2046f8c1d	Add sanity checks for NCCL detection. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22819 Test Plan: Imported from OSS Differential Revision: D16281714 Pulled By: ezyang fbshipit-source-id: 396bcbf099bd07b996cf779c6b43092096b52d90	2019-07-16 11:32:32 -07:00
Hui Wu	07ef85e326	Add USE_MKLDNN_CBLAS build option. (#19014 ) Summary: MKL-DNN is the main library for computation when we use ideep device. It can use kernels implemented by different algorithms (including JIT, CBLAS, etc.) for computation. We add the "USE_MKLDNN_CBLAS" (default OFF) build option so that users can decide whether to use CBLAS computation methods or not. Pull Request resolved: https://github.com/pytorch/pytorch/pull/19014 Differential Revision: D16094090 Pulled By: ezyang fbshipit-source-id: 3f0b1d1a59a327ea0d1456e2752f2edd78d96ccc	2019-07-02 12:29:54 -07:00
Hong Xu	b9ede6600e	Remove the USE_MIOPEN build option as MIOpen is always used when built with ROCm. (#22420 ) Summary: Close https://github.com/pytorch/pytorch/issues/22200 Pull Request resolved: https://github.com/pytorch/pytorch/pull/22420 Differential Revision: D16087538 Pulled By: bddppq fbshipit-source-id: ecf3e7eb8213bb093e1c5290d096c233284a2ff9	2019-07-02 00:05:59 -07:00
Jon Malmaud	bfeff1eb8f	Stubs for torch.nn (#19089 ) Summary: Closes https://github.com/pytorch/pytorch/issues/18724 Pull Request resolved: https://github.com/pytorch/pytorch/pull/19089 Differential Revision: D16073654 Pulled By: ezyang fbshipit-source-id: 5642179651ce45ab7c5a46cc1fcc4fd6b37fa71c	2019-07-01 09:50:17 -07:00
Pieter Noordhuis	6ff0c6ca3f	Remove THD (#22065 ) Summary: It's been ~9 months since moving THD to the `torch.distributed.deprecated` namespace (see https://github.com/pytorch/pytorch/issues/11405) and we haven't seen issues related to it, so it's time to remove it. Closes https://github.com/pytorch/pytorch/issues/18967. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22065 Reviewed By: mrshenli Differential Revision: D15983669 Pulled By: pietern fbshipit-source-id: 2a2f5866f9a63040bc7cef3956d5fd215aba7165	2019-06-25 12:19:13 -07:00
Ilia Cherniavskii	6350dbddd1	Fix sequential MKL case (#22062 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22062 ghimport-source-id: a30255d7453c4ffecf40215a785c1e06b7296368 Test Plan: USE_CUDA=0 PARALLEL_BACKEND=OPENMP BLAS=MKL USE_MKLDNN=1 MKL_SEQ=1 MKLDNN_THREADING=SEQ BUILD_BINARY=1 python setup.py develop --cmake ./build/bin/parallel_info Imported from OSS Differential Revision: D15938079 Pulled By: ilia-cher fbshipit-source-id: e7ef0c5bc75ebb845ebe66bf76a4070d45305b35	2019-06-24 12:56:43 -07:00
Hong Xu	0408697317	Followup cleanup in cmake.py and add a comment in setup.py (#21792 ) Summary: Following up `b811b6d5c0` * Use property instead of __setattr__ in CMake. * Add a comment clarifying when built_ext.run is called. --- cc ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/21792 Differential Revision: D15860606 Pulled By: umanwizard fbshipit-source-id: ba1fa07f58d4eac81ac27fa9dc7115d1cdd3dec0	2019-06-17 13:46:25 -07:00
Hong Xu	b811b6d5c0	When building extensions, honor options set in CMake. (#21653 ) Summary: Currently when building extensions, variables such as USE_CUDA, USE_CUDNN are used to determine what libraries should be linked. But we should use what CMake has detected, because: 1. If CMake found them unavailable but the variables say some libraries should be linked, the build would fail. 2. If the first build is made using a set of non-default build options, rebuild must have these option passed to setup.py again, otherwise the extension build process is inconsistent with CMake. For example, ```bash # First build USE_CUDA=0 python setup.py install # Subsequent builds like this would fail, unless "build/" is deleted python setup.py install ``` This commit addresses the above issues by using variables from CMakeCache.txt when building the extensions. --- The changes in `setup.py` may look lengthy, but the biggest changed block is mostly moving them into a function `configure_extension_build` (along with some variable names changed to `cmake_cache_vars['variable name']` and other minor changes), because it must be called after CMake has been called (and thus the options used and system environment detected by CMake become available). Pull Request resolved: https://github.com/pytorch/pytorch/pull/21653 Differential Revision: D15824506 Pulled By: ezyang fbshipit-source-id: 1e1eb7eec7debba30738f65472ccad966ee74028	2019-06-14 08:13:40 -07:00
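The idea in the commit above (trusting what CMake recorded in `CMakeCache.txt` rather than the environment variables of the current invocation) can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the function name `parse_cmake_cache` is hypothetical, and the boolean handling is a simplified assumption about how CMake spells true and false values.

```python
import re
from typing import Dict, Union

# A cache entry looks like NAME:TYPE=VALUE, e.g. USE_CUDA:BOOL=OFF.
_CACHE_ENTRY = re.compile(r"^([^:=]+):([A-Z]+)=(.*)$")

# Simplified assumption: values CMake commonly treats as false.
_FALSE_VALUES = {"", "0", "OFF", "NO", "FALSE", "N", "IGNORE", "NOTFOUND"}


def parse_cmake_cache(text: str) -> Dict[str, Union[bool, str]]:
    """Hypothetical helper: turn CMakeCache.txt content into a dict.

    BOOL entries become Python bools; everything else stays a string.
    """
    result: Dict[str, Union[bool, str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and the two comment styles used in the cache file.
        if not line or line.startswith("#") or line.startswith("//"):
            continue
        match = _CACHE_ENTRY.match(line)
        if match is None:
            continue
        name, entry_type, value = match.groups()
        if entry_type == "BOOL":
            upper = value.upper()
            is_false = upper in _FALSE_VALUES or upper.endswith("-NOTFOUND")
            result[name] = not is_false
        else:
            result[name] = value
    return result


if __name__ == "__main__":
    sample = "//Build with CUDA\nUSE_CUDA:BOOL=OFF\nCMAKE_BUILD_TYPE:STRING=Release\n"
    cmake_cache_vars = parse_cmake_cache(sample)
    print(cmake_cache_vars["USE_CUDA"], cmake_cache_vars["CMAKE_BUILD_TYPE"])
```

With a dictionary like this, an extension build step could consult `cmake_cache_vars['USE_CUDA']` after CMake has run, so a rebuild without `USE_CUDA=0` on the command line would still agree with the first configuration.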
Ilia Cherniavskii	5485f09f18	Native TBB parallel backend (#20480 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20480 ghimport-source-id: c710f897c4c9b9616fc3dd76d80b4845aea43a1f Differential Revision: D15333692 Pulled By: ilia-cher fbshipit-source-id: 61e476dd5c737fe144e3aec000d8ebb11fbc0547	2019-06-13 10:11:16 -07:00
Karl Ostmo	49481d576d	Torch rename (#20774 ) Summary: This renames the CMake `caffe2` target to `torch`, as well as renaming `caffe2_gpu` to `torch_gpu` (and likewise for other gpu target variants). Many intermediate variables that don't manifest as artifacts of the build remain for now with the "caffe2" name; a complete purge of `caffe2` from CMake variable names is beyond the scope of this PR. The shell `libtorch` library that had been introduced as a stopgap in https://github.com/pytorch/pytorch/issues/17783 is again flattened in this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20774 Differential Revision: D15769965 Pulled By: kostmo fbshipit-source-id: b86e8c410099f90be0468e30176207d3ad40c821	2019-06-12 20:12:34 -07:00
Hong Xu	646a7f99bb	Move management of calls of "cmake --build" to setup_helper/cmake.py and refactor as a CMake class Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21493 Differential Revision: D15759279 Pulled By: ezyang fbshipit-source-id: 157e1de36f1c5a51caf2a25b363a94369c442012	2019-06-11 07:04:05 -07:00
Hong Xu	240d62fbaa	Move redundant code that checks NumPy during build to a helper module and add an option to disable building with NumPy Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21417 Reviewed By: ezyang Differential Revision: D15694357 Pulled By: fmassa fbshipit-source-id: bc1bda23349ba4531f19619fa4adecb846225c20	2019-06-06 08:15:19 -07:00
Hong Xu	9a989ec469	Add an option to stop the build process once cmake terminates. (#21034 ) Summary: Add an option to setup.py to stop the build process once cmake terminates. This gives users a chance to fine-tune build options. Also update README accordingly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/21034 Differential Revision: D15530096 Pulled By: soumith fbshipit-source-id: 71ac6ff8483c3ee77c38d88f0d059db53a7d3901	2019-05-28 17:11:00 -07:00
Ilia Cherniavskii	580eab6562	Restore TBB module (#20454 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20454 ghimport-source-id: 14aca1dedbe647d41e55e7538a6b7eeab0fc4384 Differential Revision: D15326062 Pulled By: ilia-cher fbshipit-source-id: 02b005a679b10dc7a264978e87a8d2bb98ab972f	2019-05-28 02:49:36 -07:00
Ilia Cherniavskii	82aecfad6a	Native ATen/Parallel backend (#20087 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20087 ghimport-source-id: bcfc8a86abe0893e4a380fe6f6123e2082ba4317 Differential Revision: D15248663 Pulled By: ilia-cher fbshipit-source-id: fdb7a8860c85d8202026b629cb7fa344782bd2c4	2019-05-28 01:40:54 -07:00
Hong Xu	1e8f129a05	In setup.py, also check some submodules of submodules. (#20937 ) Summary: Sometimes users forget to use the "--recursive" option when they update submodules. This added check should help expose this issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20937 Differential Revision: D15502846 Pulled By: mrshenli fbshipit-source-id: 34c28a2c71ee6442d16b8b741ea44a18733b1536	2019-05-26 18:43:24 -07:00
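A check of this kind might look like the sketch below. It assumes that a submodule which was never fetched shows up as a missing or empty directory; the helper name `find_unpopulated` and any paths passed to it are hypothetical and are not taken from setup.py.

```python
import os
from typing import Iterable, List


def find_unpopulated(base_dir: str, rel_paths: Iterable[str]) -> List[str]:
    """Hypothetical helper: return the relative paths that are missing or empty.

    Assumption: an un-fetched (nested) submodule is either absent or an
    empty directory inside the working tree.
    """
    unpopulated = []
    for rel_path in rel_paths:
        full_path = os.path.join(base_dir, rel_path)
        if not os.path.isdir(full_path) or not os.listdir(full_path):
            unpopulated.append(rel_path)
    return unpopulated
```

A build script could call this with a list of nested submodule folders and, if the result is non-empty, stop with a message suggesting `git submodule update --init --recursive`.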
Gregory Chanan	47043220ee	Update version strings to 1.2 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20812 Differential Revision: D15451892 Pulled By: gchanan fbshipit-source-id: 07355dbd446053a69b5cf4e3be1842aa1075c71f	2019-05-24 11:07:29 -07:00
Ilia Cherniavskii	c3d05e86cc	Resend "Split ATen/Parallel into interface and backend" (#20825 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20825 ghimport-source-id: 0371fbd37cb37635647d473d5ac9f2859e787061 Differential Revision: D15458073 Pulled By: ilia-cher fbshipit-source-id: cd27d0da1691f6be1183cd152348ac0d93a53996	2019-05-24 02:03:06 -07:00
Hong Xu	795a1a6ffa	When detecting numpy, assign relevant variables outside the try block (#20739 ) Summary: When detecting the presence of NumPy using import, move numpy-related variable assignments outside the try block (i.e., to an else block) to improve readability. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20739 Differential Revision: D15453916 Pulled By: ezyang fbshipit-source-id: d3c37f2b290846be3c6a1462251cbb3e95d493be	2019-05-22 11:27:36 -07:00
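The `try`/`except`/`else` pattern described in the commit above can be illustrated with a short sketch. The function name `detect_numpy` and its return shape are assumptions made for this example; only `numpy.get_include()` is a real NumPy call.

```python
from typing import Optional, Tuple


def detect_numpy() -> Tuple[bool, Optional[str]]:
    """Hypothetical helper: report whether NumPy is importable and where its headers are."""
    try:
        # Only the statement that can raise ImportError lives in the try block.
        import numpy
    except ImportError:
        return False, None
    else:
        # Runs only when the import succeeded, so errors raised here are
        # not silently mistaken for "NumPy is not installed".
        return True, numpy.get_include()
```

Keeping the assignments in the `else` branch means an unrelated failure after the import is not swallowed by the `except ImportError` handler.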
Edward Yang	fd95947e68	Revert D15248618: Split ATen/Parallel into interface and backend Differential Revision: D15248618 Original commit changeset: 060879266bc8 fbshipit-source-id: fc5cbb030b87613c9e15100118c3d4a064097c20	2019-05-22 09:55:51 -07:00
Ilia Cherniavskii	c4a3b4d528	Split ATen/Parallel into interface and backend (#20057 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20057 ghimport-source-id: c583f61bf661c994eb4d0625748a299e892a7246 Differential Revision: D15248618 Pulled By: ilia-cher fbshipit-source-id: 060879266bc8616916fe220adef6ae6c0b076fbd	2019-05-21 19:15:47 -07:00
Ilia Cherniavskii	481b6d0268	Allow a non-OpenMP based build (#19749 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19749 ghimport-source-id: a6636c0acddbdc5fd5b0dcb20b9f80cbdb9159b9 Differential Revision: D15141993 Pulled By: ilia-cher fbshipit-source-id: 96085608398b2a4c97c68b2948f5184d07f9ad3d	2019-05-06 19:34:48 -07:00
Bram Wasti	035966d538	Add options to Operator to enable registration of alias analysis passes (#19382 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19382 ghimport-source-id: aeaad3b84ea20dd95b38635ca28c5ff657187909 Differential Revision: D14990873 Pulled By: bwasti fbshipit-source-id: e1292ac8358ca8ff5bad8d8aeaddf06c23e66067	2019-05-06 15:40:13 -07:00
Jon Malmaud	0565141728	Type annotations for `util.data`. (#18963 ) Summary: I haven't had a chance to rigorously try these out yet so don't merge yet. Closes #18725. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18963 Differential Revision: D14832897 Pulled By: ezyang fbshipit-source-id: 4780e7a34126bc66ddbfd9d808dfc9e0edd77e68	2019-04-08 09:52:53 -07:00
Jon Malmaud	1b25fdbcd0	More type stubs (#18511 ) Summary: Added stubs for: * The `device` module * The `cuda` module * Parts of the `optim` module * Began adding stubs for the `autograd` module. I'll annotate more later but `no_grad` and friends are probably the most used exports from it so it seemed like a good place to start. This would close #16996, although comments on that issue reference other missing stubs so maybe it's worth keeping open as an umbrella issue. The big remaining missing package is `nn`. Also added a `py.typed` file so mypy will pick up on the type stubs. That closes #17639. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18511 Differential Revision: D14715053 Pulled By: ezyang fbshipit-source-id: 9e4882ac997063650e6ce47604b3eaf1232c61c9	2019-04-01 16:03:58 -07:00
Shuichi KITAGUCHI	ddbfdc911d	Create torch/lib directory before copying _C.lib on Windows environment. (#18666 ) Summary: `python setup.py develop` fails with following messages. ~~~ ... -- Building with NumPy bindings -- Not using cuDNN -- Not using MIOpen -- Not using CUDA -- Using MKLDNN -- Not using NCCL -- Building without distributed package Copying extension caffe2.python.caffe2_pybind11_state Copying caffe2.python.caffe2_pybind11_state from torch\Lib\site-packages\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd to C:\data\source\pytorch\build\lib.win-amd64-3.7\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd copying torch\Lib\site-packages\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd -> C:\data\source\pytorch\build\lib.win-amd64-3.7\caffe2\python building 'torch._C' extension creating build\temp.win-amd64-3.7 creating build\temp.win-amd64-3.7\Release creating build\temp.win-amd64-3.7\Release\torch creating build\temp.win-amd64-3.7\Release\torch\csrc ... creating C:\data\source\pytorch\build\lib.win-amd64-3.7\torch C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\bin\HostX64\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /nodefaultlib:libucrt.lib ucrt.lib /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\data\source\pytorch\torch\lib /LIBPATH:C:\data\dlenv\libs /LIBPATH:C:\data\dlenv\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.17763.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.17763.0\um\x64" shm.lib torch_python.lib /EXPORT:PyInit__C build\temp.win-amd64-3.7\Release\torch/csrc/stub.obj /OUT:build\lib.win-amd64-3.7\torch\_C.cp37-win_amd64.pyd 
/IMPLIB:build\temp.win-amd64-3.7\Release\torch/csrc\_C.cp37-win_amd64.lib /NODEFAULTLIB:LIBCMT.LIB Creating library build\temp.win-amd64-3.7\Release\torch/csrc\_C.cp37-win_amd64.lib and object build\temp.win-amd64-3.7\Release\torch/csrc\_C.cp37-win_amd64.exp Generating code Finished generating code copying build\lib.win-amd64-3.7\torch\_C.cp37-win_amd64.pyd -> torch copying build\lib.win-amd64-3.7\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd -> caffe2\python copying build/temp.win-amd64-3.7/Release/torch/csrc/_C.cp37-win_amd64.lib -> build/lib.win-amd64-3.7/torch/lib/_C.lib error: could not create 'build/lib.win-amd64-3.7/torch/lib/_C.lib': No such file or directory ~~~ When `python setup.py install` is executed, `torch/lib` has already been created by a previous step (copying many files) and this copy succeeds. But in develop mode, that step is not executed and this copy fails. This patch creates the `torch/lib` directory if it does not exist. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18666 Differential Revision: D14704269 Pulled By: ezyang fbshipit-source-id: b2d7c698a906b945bf34bb78f17b91b4fdfd3294	2019-04-01 07:28:08 -07:00
Edward Yang	173f224570	Turn on F401: Unused import warning. (#18598 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598 ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a Stack from [ghstack](https://github.com/ezyang/ghstack): * #18598 Turn on F401: Unused import warning. This was requested by someone at Facebook; this lint is turned on for Facebook by default. "Sure, why not." I had to noqa a number of imports in __init__. Hypothetically we're supposed to use __all__ in this case, but I was too lazy to fix it. Left for future work. Be careful! flake8-2 and flake8-3 behave differently with respect to import resolution for # type: comments. flake8-3 will report an import unused; flake8-2 will not. For now, I just noqa'd all these sites. All the changes were done by hand. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: D14687478 fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3	2019-03-30 09:01:17 -07:00
Gao, Xiang	a40e0a7f2d	Add torch.version.git_version (#18299 ) Summary: Fixes: https://github.com/pytorch/pytorch/issues/18293 cc: colesbury Pull Request resolved: https://github.com/pytorch/pytorch/pull/18299 Differential Revision: D14611972 Pulled By: soumith fbshipit-source-id: cdb48ef37c8869713a9a43ea0da08e1bed9279a2	2019-03-25 19:59:40 -07:00
Sebastian Messmer	daa77c6e26	Move schema inference to c10 (#18090 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18090 This schema inference is needed by the c10 operator registration mechanism. Move it to c10. It is going to be used by diffs stacked on top. Reviewed By: ezyang Differential Revision: D14491454 fbshipit-source-id: 0f8ddcdbd91467c8347d315dd443a1ca8b216481	2019-03-21 14:57:30 -07:00
peter	906f9efc57	Revert "Add check for x64 Python before setup (#17707 )" (#17864 ) Summary: This reverts commit `08fb9021da`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17864 Differential Revision: D14404920 Pulled By: soumith fbshipit-source-id: d41fc06e249f3437d4f80d1d6a5fdbd44c90462b	2019-03-11 08:52:13 -07:00
peter	08fb9021da	Add check for x64 Python before setup (#17707 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/17657. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17707 Differential Revision: D14346705 Pulled By: ezyang fbshipit-source-id: 5daafacdb99eb9a9c6517263d10f20c79f920d24	2019-03-06 10:48:16 -08:00
Lu Fang	9e08c998db	Throw exception when foxi is not checked out (#17477 ) Summary: Add check and provide useful warning/error information to user if foxi is not checked out. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17477 Reviewed By: zrphercule Differential Revision: D14212896 Pulled By: houseroad fbshipit-source-id: 557247d5d8fdc016b1c24c2a21503e59f874ad09	2019-02-25 14:39:24 -08:00
Vishwak Srinivasan	9e69703dac	USE_ --> BUILD_ for CAFFE2_OPS and TEST (#17390 ) Differential Revision: D14195572 Pulled By: soumith fbshipit-source-id: 28e4ff3fe03a151cd4ed014c64253389cb85de3e	2019-02-22 17:19:44 -08:00
Zachary DeVito	356a94b64e	Lazily load libcuda libnvrtc from c++ (#17317 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/16860 Pull Request resolved: https://github.com/pytorch/pytorch/pull/17317 Differential Revision: D14157877 Pulled By: zdevito fbshipit-source-id: c37aec2d77c2e637d4fc6ceffe2bd32901c70317	2019-02-22 13:51:45 -08:00
Soumith Chintala	3069c45069	upgrade documentation in setup.py to NO_ -> USE_ (#17333 ) Summary: fixes https://github.com/pytorch/pytorch/issues/17265 Pull Request resolved: https://github.com/pytorch/pytorch/pull/17333 Differential Revision: D14168483 Pulled By: soumith fbshipit-source-id: a79f4f9d9e18cb64e2f56f777caa69ae92d2fa4b	2019-02-21 10:25:43 -08:00
Tri Dao	37890610b0	Include vec256 headers in setup.py (#17220 ) Summary: Fix #16650. Headers such as `ATen/cpu/vml.h` contain `#include <ATen/cpu/vec256/vec256.h>` for example, but these vec256 headers aren't included, due to commit `e4c0bb1`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17220 Differential Revision: D14165695 Pulled By: ezyang fbshipit-source-id: 27b2aa2a734b3719ca4af0565f79623b64b2620f	2019-02-21 07:37:01 -08:00
Elias Ellison	89df22e57b	Lightweight String check Utility (#16858 ) Summary: Lightweight implementation of the LLVM FileCheck utility. Currently it only handles string matching; regexes and saving a regex to a variable name can be added as needed. The current intended usage is through the FileCheckBuilder python handle, and is shown in the tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16858 Differential Revision: D14096244 Pulled By: eellison fbshipit-source-id: c7c8d1457691c105e6ccbb3c1a378d96baac2569	2019-02-19 12:31:57 -08:00
Dmytro Dzhulgakov	5a26579e27	Add more headers to setup.py to make pytorch/benchmark work (#16890 ) Summary: Since we don't do tmp_install any more it's better to include all necessary headers. cc kostmo for better suggestions of how to list all headers here Pull Request resolved: https://github.com/pytorch/pytorch/pull/16890 Differential Revision: D14079848 Pulled By: dzhulgakov fbshipit-source-id: 4522c80d05e5d91f99f6700cde46cac559330d28	2019-02-13 23:14:36 -08:00
Simeon Monov	bad4442a7c	Parse the command line and check the arguments before build_deps() (#16914 ) Summary: This is needed to check for wrong arguments or --help options before `build_deps()` is executed. Otherwise command line arguments are not parsed and checked until `setup()` is run. Fixes: #16707 Pull Request resolved: https://github.com/pytorch/pytorch/pull/16914 Differential Revision: D14041236 Pulled By: soumith fbshipit-source-id: 41f635772ccf47f05114775d5a19ae04c495ab3b	2019-02-12 00:15:42 -08:00
Zachary DeVito	21193bf123	try to get rid of tmp_install (#16414 ) Summary: Rehash of previous attempts. This tries a different approach where we accept the install as specified in cmake (leaving bin/ include/ and lib/ alone), and then try to adjust the rest of the files to this more standard layout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16414 Differential Revision: D13863635 Pulled By: zdevito fbshipit-source-id: 23725f5c64d7509bf3ca8f472dcdcad074de9828	2019-01-29 17:29:40 -08:00
Thomas Viehmann	6a6983ed7f	create type hint stub files for module torch (#12500 ) Summary: We have: - This is an initial stab at creating a type stub `torch/__init__.pyi` . - This is only tested on Python 3, since that's the only Python version mypy works on. - So far, we only aim at doing this for torch functions and torch.Tensor. - Quite a few methods and functions have to be typed manually. These are done in `torch/__init__.pyi.in` For me, PyCharm (the non-paid one) didn't seem to indicate errors in the .pyi when opening and seemed to be able to get the type hint for the few functions I tried, but I don't use PyCharm for my usual PyTorch activities, so I didn't extensively try this out. An example of a generated PYI is at [this gist](https://gist.github.com/ezyang/bf9b6a5fa8827c52152858169bcb61b1). Pull Request resolved: https://github.com/pytorch/pytorch/pull/12500 Differential Revision: D13695553 Pulled By: ezyang fbshipit-source-id: 4566c71913ede4e4c23ebc4a72c17151f94e8e21	2019-01-29 12:14:17 -08:00
Zachary DeVito	9477a5d9c8	Remove bash from build (#16289 ) Summary: This commit removes the dependency on `build_pytorch_libs.sh` by moving the remaining functionality that is not expressible in cmake into python. Removing the indirection through bash also removes over 300 lines of environment munging code that is incredibly hard to understand because it passes a lot of secret parameters through `os.env`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16289 Reviewed By: ezyang Differential Revision: D13821662 Pulled By: zdevito fbshipit-source-id: d658d26925e3b1169ac1e3d44a159cf8a1f0d9b1	2019-01-25 16:03:53 -08:00
Zachary DeVito	0cd1ab82b0	Remove dead code from setup.py, remove need for build target. (#16162 ) Summary: Now it is only necessary to use 'develop' or 'install' to build. Incremental cmake is on by default. `develop --cmake` forces it to rerun. The NinjaBuilder stuff is dead. It was used to make building _C.so faster but now _C.so is just an empty stub file. Removed a bunch of custom build commands from setup.py that are no longer meaningful now that cmake handles most of the build. Removed unused targets in build_pytorch_lib.sh/bat Pull Request resolved: https://github.com/pytorch/pytorch/pull/16162 Differential Revision: D13744155 Pulled By: zdevito fbshipit-source-id: d836484782c65b7f8e8c7a82620886f7a7777892	2019-01-21 17:27:56 -08:00
Zachary DeVito	b5c733324c	Fix RERUN_CMAKE Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16132 Differential Revision: D13726816 Pulled By: zdevito fbshipit-source-id: 26ad70651b0138642ad5240670f5c452018c13a2	2019-01-18 00:04:31 -08:00
Sebastian Messmer	3e85a2bcbf	Move c10 dispatcher back to ATen/core (#16050 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16050 The c10 dispatcher will (soon) depend on IValue and IValue can't be moved to c10 yet because it depends on at::Tensor, which depends on legacy Type dispatch and we don't want the legacy dispatch in c10. So instead, we move the c10 dispatcher back to ATen/core until we can actually move at::Tensor to c10. Reviewed By: ezyang Differential Revision: D13684517 fbshipit-source-id: 1125f4254223907c52f96ff73034f6d4ae9fd0a7	2019-01-17 15:56:52 -08:00
Jesse Hellemn	99b029aca3	Include all Caffe2 headers in Python installations (#16124 ) Summary: Confirmed on a local run that all the additional headers are present. This shouldn't be caught in any existing tests though. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16124 Differential Revision: D13720773 Pulled By: pjh5 fbshipit-source-id: 22a42639f5649cac555ecc5a8b6760a8cbfcf01f	2019-01-17 13:51:51 -08:00
peter	f7733526aa	Generate PDB files for better debugging on Windows (#16008 ) Summary: 1. Unify `build_pytorch_libs.bat`, `setup.py` and `torch/CMakeLists.txt` on the debugging flags with the `CMAKE_BUILD_TYPE` being `Debug`, `Release` and `RelWithDebInfo`. 2. Install PDBs through CMake if they are generated. Reference: 1. CMake PDB install: https://gitlab.kitware.com/cmake/cmake/issues/18393#note_459199 2. About debugging flags https://stackoverflow.com/a/4662345 3. MSDN page about /DEBUG flag: https://docs.microsoft.com/en-us/cpp/build/reference/debug-generate-debug-info?view=vs-2017 4. MSDN page about /Z{i/I/7}: https://docs.microsoft.com/en-us/cpp/build/reference/z7-zi-zi-debug-information-format?view=vs-2017 Work to do: - [x] Test the changes work in Release config through this PR - [ ] <del> Test debug build through https://github.com/pytorch/pytorch/pull/16009 </del> - [x] Test release build with debugging symbols through #16013 Difficulties: - [x] Replace /Zi flags with /Z7 (which will be added if DEBUG or RelWithDebInfo is used), as it is not supported by sccache - [x] Resolve `LINK : fatal error LNK1210: exceeded internal ILK size limit; link with /INCREMENTAL:NO` in the debug build - [ ] DEBUG build blocked by a MSVC bug. In order to resolve it, we'll need to update the MSVC in CI: https://developercommunity.visualstudio.com/content/problem/225957/fatal-error-lnk1318-unexpected-pdb-error-ok-0.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/16008 Differential Revision: D13709527 Pulled By: ezyang fbshipit-source-id: e8365bc75d9ec64099093f7001f83d99a06b196b	2019-01-16 23:34:32 -08:00
Jesse Hellemn	406b9c49bd	Fix Python path finding for benchmark tests Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16022 Differential Revision: D13673792 Pulled By: pjh5 fbshipit-source-id: 177a823ef343b7f60e26ad9ef51415332045438d	2019-01-15 10:48:40 -08:00
Jesse Hellemn	8964a2e6e6	Split Caffe2 CI into cmake-only and python builds (#15917 ) Summary: bypass-lint - Change all Caffe2 builds to use setup.py instead of cmake - Add a -cmake- Caffe2 build configuration that uses cmake and only builds cpp - Move skipIfCI logic from onnx test scripts to the rest of CI logic - Removal of old PYTHONPATH/LD_LIBRARY_PATH/etc. env management Pull Request resolved: https://github.com/pytorch/pytorch/pull/15917 Reviewed By: orionr Differential Revision: D13637583 Pulled By: pjh5 fbshipit-source-id: c5c5639db0251ba12b6e4b51b2ac3b26a8953153	2019-01-14 15:20:44 -08:00
Sebastian Messmer	d408324350	Move files to/from c10/core and c10/util (#15316 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15316 This starts cleaning up the files in c10 according to the module structure we decided on. Move to c10/util: - Half.h, Half-inl.h, Half.cpp, bitcasts.h Move to c10/core: - Device.h, Device.cpp - DeviceType.h, DeviceType.cpp i-am-not-moving-c2-to-c10 Reviewed By: dzhulgakov Differential Revision: D13498493 fbshipit-source-id: dfcf1c490474a12ab950c72ca686b8ad86428f63	2019-01-10 16:22:22 -08:00
peter	0ed3f766e9	Unify flags and environmental variable when building LibTorch/PyTorch (#15868 ) Summary: Fixes #15858. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15868 Differential Revision: D13622354 Pulled By: soumith fbshipit-source-id: bb8c49520ebf926c6194d42db75accba867018c7	2019-01-10 06:47:14 -08:00
andersj	8a5ba577c1	Revert "remove use of tmp_install" (#15847 ) Summary: This reverts commit `04bf528589`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15847 Differential Revision: D13603174 Pulled By: anderspapitto fbshipit-source-id: ae321434d3345ad94fad67bf71fd027cddeb4588	2019-01-08 16:30:19 -08:00
Jesse Hellemn	4f51ca490e	Correcting source pybind11 library to install into Python Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15836 Reviewed By: anderspapitto Differential Revision: D13601331 Pulled By: pjh5 fbshipit-source-id: 36785c501774c01f47acb49cdac265b2c95a5040	2019-01-08 15:06:55 -08:00
andersj	04bf528589	remove use of tmp_install Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14553 Differential Revision: D13583335 Pulled By: anderspapitto fbshipit-source-id: 8711fead9eda877c1037a0bc59f91a3d2e01f3e0	2019-01-04 13:48:12 -08:00
Soumith Chintala	4c5b1cc026	version bump to 1.1 (#15554 ) Summary: version bump to 1.1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/15554 Differential Revision: D13550818 Pulled By: soumith fbshipit-source-id: 8a28582c98b42c081e103581551a01fd96c9f42d	2018-12-26 15:44:25 -08:00
Peter Goldsborough	ad6799537e	Support stateful dataset (#15096 ) Summary: Currently re-implements the dataloader for stateful datasets. Outstanding work: - Refactor DataLoader and DataLoader2 to have common base classes and only differ in specific pieces of logic - Figure out how to not duplicate the `MapDataset` logic for stateful vs. non-stateful Pull Request resolved: https://github.com/pytorch/pytorch/pull/15096 Differential Revision: D13522043 Pulled By: goldsborough fbshipit-source-id: 08e461ca51783047f11facc4d27dfa2e4f1e4c2a	2018-12-24 06:26:40 -08:00
peter	d71fac20eb	Refactor hotpatch_vars and apply it to libtorch (#14976 ) Summary: Fixes #14801. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14976 Differential Revision: D13485381 Pulled By: soumith fbshipit-source-id: 0af3c2e1b90988d56f6f85632328d1e4b788ffd2	2018-12-16 21:53:31 -08:00
Junjie Bai	bdfff2f8c2	Add missing caffe2_hip extension in setup.py Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15189 Reviewed By: orionr Differential Revision: D13457644 Pulled By: bddppq fbshipit-source-id: c2363e9b8fd21709b62777e5b2199f01ec1c65f8	2018-12-13 15:59:51 -08:00
Zachary DeVito	92314c83fa	re-enable copy of python files, but be careful that the copy is only … (#14982 ) Summary: …done once This allows a no-op build to work correctly even when BUILD_CAFFE2_OPS is on. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14982 Differential Revision: D13413960 Pulled By: zdevito fbshipit-source-id: 6e5412a8c375af8a47c76f548cdd31cff15f3853	2018-12-11 16:54:08 -08:00
Orion Reblitz-Richardson	687834dcb4	Install cpp tests when built (#15000 ) Summary: This is broken out of https://github.com/pytorch/pytorch/pull/13733/ We want to install cpp tests so they can ultimately be runnable from that location for Caffe2 tests run from PyTorch builds. cc pjh5 yf225 anderspapitto Pull Request resolved: https://github.com/pytorch/pytorch/pull/15000 Reviewed By: pjh5 Differential Revision: D13416253 Pulled By: orionr fbshipit-source-id: 51280be0a22557a742f90c9f303c58c35cbd4a38	2018-12-11 10:07:48 -08:00
Jesse Hellemn	5222a1b190	Fixing reading of FBGEMM from env variables Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15023 Reviewed By: orionr Differential Revision: D13406778 Pulled By: pjh5 fbshipit-source-id: 2265f01170fb7969cbdf4e44ca6ef183f5d8017d	2018-12-10 18:18:38 -08:00
Zachary DeVito	e747acbebb	Respect -q of setup.py (#14972 ) Summary: 1. Changes the prints along the 'rebuild' pathway to respect the '-q' flag of setup.py A clean rebuild now only prints: [zdevito@devgpu172.prn2 /data/users/zdevito/pytorch] python setup.py -q rebuild develop [0/1] Install the project... -- Install configuration: "RelWithDebInfo" ninja: no work to do. ninja: no work to do. ninja: no work to do. ninja: no work to do. ninja: no work to do. ninja: no work to do. 2. Deletes apparently dead calls to `generate_code`. Now that CMake builds these files, it appears that it is getting called twice and the second version is never used. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14972 Reviewed By: soumith Differential Revision: D13396330 Pulled By: zdevito fbshipit-source-id: 83c45143bbc6a6d2c1cfee929291ec059f2b5dc3	2018-12-09 22:47:49 -08:00
Sergei Nikolaev	a0ee3a279c	USE_TENSORRT support and TensorRT 5 compatibility Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13945 Differential Revision: D13317525 Pulled By: yinghai fbshipit-source-id: 8630dfec1bbc5aac19539e344e7c38a7fd8b051d	2018-12-07 14:01:11 -08:00
HB_alon	5e307bd1be	use "Extension" instead of the unimported "setuptools.Extension" (#14475 ) Summary: use "Extension" instead of the unimported "setuptools.Extension" Pull Request resolved: https://github.com/pytorch/pytorch/pull/14475 Differential Revision: D13356219 Pulled By: ezyang fbshipit-source-id: 5a3e7eb73a32d6bf09676efd9eddded5586435cd	2018-12-05 22:18:47 -08:00
Soumith Chintala	aa842fe101	clean up linkage options (#14609 ) Summary: minor code cleanup Differential Revision: D13277803 Pulled By: soumith fbshipit-source-id: 5ef925fe95037cab540b329054d7070c1ea7031e	2018-11-30 09:36:59 -08:00
andersj	fb7e40b7eb	nccl fixes (#14195 ) Summary: This has 4 changes 1) propagate USE_SYSTEM_NCCL. Previously it was ignored and cmake always did a FindPackage 2) respect SCCACHE_DISABLE in our caffe2 sccache wrapper for circleci 3) use SCCACHE_DISABLE when building nccl, because it triggers the same bug as when using CCACHE (already tracked in https://github.com/pytorch/pytorch/issues/13362). This was hidden because we weren't respecting USE_SYSTEM_NCCL, and were never building nccl ourselves in CI 4) In one particular CI configuration (caffe2, cuda 8, cudnn 7), force USE_SYSTEM_NCCL=1. Building the bundled nccl triggers a bug in nvlink. I've done some investigation, but this looks like a tricky, preexisting bug, so rather than hold up this diff I'm tracking it separately in https://github.com/pytorch/pytorch/issues/14486 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14195 Differential Revision: D13237502 Pulled By: anderspapitto fbshipit-source-id: 1100ac1269c7cd39e2e0b3ba12a56a3ce8977c55	2018-11-28 14:43:06 -08:00
Sebastian Messmer	50e9c56830	Move Scalar and ScalarType to c10/core Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14022 Reviewed By: ezyang Differential Revision: D13015236 fbshipit-source-id: 92aac4e342d85f75a31837b2943fa5b80f0c35c9	2018-11-27 12:59:36 -08:00
Zachary DeVito	788d2e87bd	Address jittering issues in python_print (#14064 ) Summary: export - print a method with python_print import - import a method with import_method We want to ensure: export(g) == export(import(export(g))) That is, after exporting/importing once, the graph will stay exactly the same. This is less strict than g == import(export(g)) which would require us to maintain a lot more information about the structure of the IR and about the names of debug symbols. This PR addresses this with the following fixes: * print out double-precision numbers with high enough precision such that they always parse in the same way * when creating loop-carried dependencies, sort them by variable name, ensuring a consistent order * parse nan correctly * DCE: remove unused outputs of if statements, and loop-carried dependencies in loops that are dead both after the loop and inside the body of the loop. * Do not set uniqueName for variables whose names are _[0-9]+, these are probably rare in user code, and we need a way to communicate that we do not care about a variable name when re-parsing the graph. Otherwise temporary variable names will jitter around. * Expand the definition of a constant in printing code to None, and family. * Allow re-treeing to work as long as the only thing in its way is a constant node. These do not have side effects but are sometimes inserted in a different order when tracing compared to how we print them. * Print all constant nodes out first in the order in which they are used_val (or, if they are inlined, ensure they get assigned CONSTANT.cX number in a consistent order). Cleanup tuples (this is done in the compiler, but not in the tracer, leading to some tuple indexing jitter if not done). * use strtod_l, not std::stod which can throw exceptions Other: * Add REL_WITH_DEB_INFO to setup.py. It already existed for the cmake files. Threading it into setup.py allows us to turn on debug symbols with optimization everywhere. 
* enable round trip testing for all generated graphs. This only adds ~6 seconds to total build time but tests printing for every graph. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14064 Differential Revision: D13094637 Pulled By: zdevito fbshipit-source-id: 0a1c6912194d965f15d6b0c6cf838ccc551f161d	2018-11-21 06:38:29 -08:00
Edward Yang	48099c23b4	Move AT_CUDA_CHECK to c10 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13910 Reviewed By: smessmer Differential Revision: D13046201 fbshipit-source-id: 8d360a0e4d6c2edf070d130e600c6b04f0ee0058	2018-11-19 08:20:10 -08:00
Anders Papitto	2983998bb3	add torch-python target (#12742 ) Summary: This is the next minimal step towards moving _C into cmake. For now, leave _C in setup.py, but reduce it to an empty stub file. All of its sources are now part of the new torch-python cmake target. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12742 Reviewed By: soumith Differential Revision: D13089691 Pulled By: anderspapitto fbshipit-source-id: 1c746fda33cfebb26e02a7f0781fefa8b0d86385	2018-11-16 11:43:48 -08:00
Edward Yang	fbabe5bf62	Rename c10::detail to c10::impl (#13838 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13838 According to Sebastian, the detail convention is specifically for header-private functionality. That's not what c10/detail is; it's general, library private headers which may be used in multiple places within PyTorch. Rename it to impl to avoid the confusion in nomenclature. Reviewed By: smessmer Differential Revision: D13024368 fbshipit-source-id: 050f2632d83a69e3ae53ded88e8f938c5d61f0ef	2018-11-14 07:39:37 -08:00
jario-jin	0bedaf9cf6	Update setup.py to support Nvidia TX2 (#13939 ) Summary: add platform.machine() == 'aarch64' for supporting Nvidia TX2 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13939 Differential Revision: D13055834 Pulled By: soumith fbshipit-source-id: 0fadc87adf9e6b796978ce743e824eb98b006856	2018-11-13 20:10:35 -08:00
CircleCI	f1a2bc4eae	Corrected python lib path on windows to be consistent with Linux (#13848 ) Summary: The python lib path on Windows was set to an incorrect path. This fixes it to be consistent with Linux. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13848 Differential Revision: D13030945 Pulled By: soumith fbshipit-source-id: 7fb9013ffe66cff98018aea25fdb5cda03cbceb1	2018-11-12 14:39:55 -08:00
Johannes M Dieterich	53a3c46950	Switch to packaged Thrust on Ubuntu, enable CentOS 7.5 as a CI target (#12899 ) Summary: 1) Use the hip-thrust version of Thrust as opposed to the GH master. (ROCm 267) 2) CentOS 7.5 docker (ROCm 279) * Always install the libraries at docker creation for ubuntu. * Add Dockerfile for CentOS ROCm * Enable the centos build * Source devtoolset in bashrc * Set locales correctly depending on whether we are on Ubuntu or CentOS * Install a newer cmake for CentOS * Checkout thrust as there is no package for CentOS yet. PyTorch/Caffe2 on ROCm passed tests: https://github.com/ROCmSoftwarePlatform/pytorch/pull/280 For attention: bddppq ezyang Docker rebuild for Ubuntu not urgent (getting rid of Thrust checkout and package install is mainly cosmetic). If docker for CentOS 7.5 is wanted, build is necessary. Build of PyTorch tested by me in CentOS docker. PyTorch unit tests work mostly, however, a test in test_jit causes a python recursion error that seems to be due to the python2 on CentOS as we haven't ever seen this on Ubuntu - hence please do not enable unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12899 Differential Revision: D13029424 Pulled By: bddppq fbshipit-source-id: 1ca8f4337ec6a603f2742fc81046d5b8f8717c76	2018-11-12 14:39:54 -08:00
David Brownell	75bf877534	Preventing error where ninja build files are overwritten when invokin… (#13698 ) Summary: …g clean and build together Pull Request resolved: https://github.com/pytorch/pytorch/pull/13698 Differential Revision: D13030905 Pulled By: soumith fbshipit-source-id: 234576ac92e0aa8c2d2409958d3cf85eb29ed1f3	2018-11-12 14:39:48 -08:00
Edward Yang	e35418b3be	New implementations of DeviceGuard, StreamGuard and MultiStreamGuard (with CUDA specializations) (#13342 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13342 This PR introduces a few new concepts: - DeviceGuardImplInterface, and implementations for CPU and CUDA, which provide a generic interface for interfacing with device and stream state, without requiring a direct dependency on the code in question. - InlineDeviceGuard, a general template for generating both specialized and dynamically dispatched device guard implementations. Dynamic dispatch is done by specializing it on a VirtualGuardImpl. - Provide a device-independent DeviceGuard class, which can be used even from CPU code. It uses the aforementioned dynamic dispatch. - CUDA-specialized CUDAGuard class, which doesn't have a dynamic dispatch but can only be used from CUDA. - StreamGuard, which is the same as above, but for streams rather than devices. - Optional variants of all the aforementioned guards, which are a no-op if no device/stream is specified - CUDAMultiStreamGuard, specifically for the case when we want to set a device on every guard. There are some subtle semantic changes, which have been thoroughly documented in the class definition. BC-breaking changes: - Move constructor/assignment have been removed from all device guard implementations. - In some cases where you previously wrote 'set_device' (or 'set_stream'), you now must write 'reset_device', because if you switch devices/device types, the stream/device on the previous device is unset. This is different from previous behavior. - CUDAGuard no longer handles streams, or multiple streams. Use CUDAStreamGuard or CUDAMultiStreamGuard as appropriate for your use case. Reviewed By: dzhulgakov Differential Revision: D12849620 fbshipit-source-id: f61956256f0b12be754b3234fcc73c2abc1be04e	2018-11-11 12:11:10 -08:00
Tongzhou Wang	a63ef1d605	Suggest git submodule update --init --recursive (#13769 ) Summary: We now have submodules that have submodules Pull Request resolved: https://github.com/pytorch/pytorch/pull/13769 Reviewed By: soumith Differential Revision: D13000203 Pulled By: SsnL fbshipit-source-id: 63c0c19c6c9d25ae3bf255a2421a82ca68278866	2018-11-09 08:41:44 -08:00
Freddie Mendoza	a8e303dc46	change USE_MKLDNN default from ON (from #13303 ) to OFF for ppc64le (#13759 ) Summary: MKLDNN is not supported on ppc64le, so change USE_MKLDNN to OFF for ppc64le Pull Request resolved: https://github.com/pytorch/pytorch/pull/13759 Differential Revision: D12993121 Pulled By: soumith fbshipit-source-id: 539d5cfcff2c03b59fa71e10b52fac333a64c381	2018-11-08 19:33:39 -08:00
Gu, Jinghui	d01cb70497	build with mkl-dnn by default (#13303 ) Summary: build with mkl-dnn by default Pull Request resolved: https://github.com/pytorch/pytorch/pull/13303 Reviewed By: yinghai Differential Revision: D12979633 Pulled By: orionr fbshipit-source-id: 00d23fa27c0d13e82f7e5acb3ebd00ed7ba1d5dc	2018-11-08 11:18:27 -08:00
Peter Goldsborough	d4f9dbfa66	Remove catch check Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13677 Differential Revision: D12961992 Pulled By: goldsborough fbshipit-source-id: 1f0207704d05ac67ed1ec1502bec617c845d9f79	2018-11-07 12:27:15 -08:00
Daya S Khudia	18de330e86	CMake integration for int8 server operators Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13558 Reviewed By: Maratyszcza Differential Revision: D12945460 Pulled By: dskhudia fbshipit-source-id: 1a91027b305fd6af77eebd9a4fad092a12f54712	2018-11-06 15:45:15 -08:00
Soumith Chintala	a7ee632dff	Various Test and build fixes (#13556 ) Summary: - fixes weights-contiguous requirement for THCUNN Convolutions - Add tests that conv backward pass works for non-contiguous weights - fix RNN tests / error messages to be consistent and pass - relax weight grad precision for fp16 for a particular test - fix regression of CMAKE_PREFIX_PATH not passing through - add missing skipIfNoLapack annotations where needed Differential Revision: D12918456 Pulled By: soumith fbshipit-source-id: 8642d36bffcc6f2957800d6afa1e10bef2a91d05	2018-11-06 07:13:47 -08:00
Ilija Radosavovic	9e432b593d	Include caffe2 proto headers in pytorch package data (#13217 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13217 Caffe2 proto headers are not included in pytorch package data (https://github.com/pytorch/pytorch/blob/master/setup.py#L1180). However, they are required for building custom Caffe2 ops living outside PyTorch/Caffe2 repo (e.g. custom Detectron ops). Reviewed By: pjh5 Differential Revision: D12815881 fbshipit-source-id: 4d1aaa6a69a2193247586e85e4244fbbdb3e8192	2018-11-03 16:19:39 -07:00
Pieter Noordhuis	24839aac59	Link libgloo.a after libc10d.a to resolve remaining symbols (#13462 ) Summary: libcaffe2.so depends on libgloo.a for the ops in caffe2/contrib/gloo. Symbols in libgloo.a that are not used are ignored and don't end up in libcaffe2.so. libc10d.a depends on the caffe2 target, which in turn depends on the gloo target, and it expects all libgloo.a symbols to be part of libcaffe2.so. Symbols from libgloo.a that are not used in libcaffe2.so remain undefined in libc10d.a. To fix this, we link to libgloo.a when linking _C.so, such that any gloo symbols in libc10d.a are resolved when linking _C.so. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13462 Differential Revision: D12892830 Pulled By: pietern fbshipit-source-id: 7560b3899b62f76081b394498480e513a84cefab	2018-11-01 16:03:33 -07:00
David Brownell	50a8f8531b	Updated for arbitrary command line arg ordering Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13253 Differential Revision: D12829884 Pulled By: soumith fbshipit-source-id: 9d8abcdf635e2daffce80ddf1e0e418a1e4c337d	2018-10-29 15:52:03 -07:00
Anders Papitto	380d2dfb27	absorb nccl (#13150 ) Summary: always build nccl from within the main cmake build, rather than via a separate invocation in build_pytorch_libs.sh. Use the existing caffe2 codepaths Pull Request resolved: https://github.com/pytorch/pytorch/pull/13150 Differential Revision: D12815674 Pulled By: anderspapitto fbshipit-source-id: a710b6f242d159b9816911a25ee2c4b8c3f855aa	2018-10-29 12:04:32 -07:00
Gu, Jinghui	dbab9b73b6	separate mkl, mklml, and mkldnn (#12170 ) Summary: 1. Remove avx2 support in mkldnn 2. Separate mkl, mklml, and mkldnn 3. Fix convfusion test case Pull Request resolved: https://github.com/pytorch/pytorch/pull/12170 Reviewed By: yinghai Differential Revision: D10207126 Pulled By: orionr fbshipit-source-id: 1e62eb47943f426a89d57e2d2606439f2b04fd51	2018-10-29 10:52:55 -07:00
Sam Gross	e6ce9f303f	Check that QNNPACK directory exists in setup.py Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13174 Differential Revision: D12808599 Pulled By: colesbury fbshipit-source-id: 2548a024043f32ee570378dfead8880b00608478	2018-10-26 14:37:11 -07:00
Marat Dukhan	5e73b828bd	CMake integration for Int8 ops Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13145 Differential Revision: D10860849 Pulled By: Maratyszcza fbshipit-source-id: fdbcc23ff9beaeaedfd561176df6cfe87685c1f5	2018-10-25 22:25:10 -07:00
Anders Papitto	e07e63f0b3	Absorb shm Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13088 Differential Revision: D10856067 Pulled By: anderspapitto fbshipit-source-id: cfbf0f6cad3953e1ee1c55482c00a3db9f140594	2018-10-25 13:55:23 -07:00
Anders Papitto	b883afc928	Absorb c10d into the main cmake build Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12953 Differential Revision: D10850274 Pulled By: anderspapitto fbshipit-source-id: 42296e6e49ad8c1845040e031eab95ddbaf58ae4	2018-10-24 22:34:00 -07:00
Anders Papitto	69906afaee	absorb THD into main cmake build (#12775 ) Summary: We want to move _C into the same cmake invocation that builds libcaffe2 and libtorch. However, _C depends on THD and c10d, which in turn depend on libcaffe2. That means that we can't move _C into that cmake file unless we do these two first. This change does so. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12775 Differential Revision: D10457374 Pulled By: anderspapitto fbshipit-source-id: 2c1aa3b8a418a73d2112e93c7da53a2e70cf7bba	2018-10-24 21:28:37 -07:00
Anders Papitto	2dacf28b66	link libgloo_cuda.a explicitly from setup.py (#12951 ) Summary: rather than pass a list through a text file Pull Request resolved: https://github.com/pytorch/pytorch/pull/12951 Differential Revision: D10528309 Pulled By: anderspapitto fbshipit-source-id: d94befcd61b6304815859694b623046f256462df	2018-10-24 13:19:46 -07:00
Yangqing Jia	52beb338ab	Add Modules_CUDA_Fix folder to installed folder (#13013 ) Summary: This is used to patch our cmake cuda scripts - should be in the installation script. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13013 Reviewed By: ir413 Differential Revision: D10519104 Pulled By: Yangqing fbshipit-source-id: 542049224ea41068f32d4c0f6399c7e8b684f764	2018-10-24 10:16:18 -07:00
Anders Papitto	8f51c513a6	gloo: build once, share between pytorch/caffe2 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12885 Differential Revision: D10492244 Pulled By: anderspapitto fbshipit-source-id: 79af1ceb9bb0dab4585a728e64554ff4f38d6c32	2018-10-22 11:06:14 -07:00
Peter Goldsborough	a022fd2d6b	Implement DataLoader (#11918 ) Summary: This PR implements a DataLoader API for the C++ frontend. The components present in this API largely match the Python API. It consists of: - `Dataset`s: Conceptually a function from a set of indices to a batch of examples; - `Transform`s: A functional transformation of a dataset. A `Map<D, T>` for Dataset `D` and transform `T` is itself a dataset; - `Sampler`s: Specify a strategy for generating indices for a new batch; - A `DataLoader`, with the ability to automatically parallelize fetching of samples across multiple worker threads; Note that collation functions fall naturally out of the `Map<Dataset, Transform>` abstraction. Things that are missing right now that maybe should be added: - Memory pinning for CUDA tensors The API was designed to be generalizable to almost any kind of dataset, transform or sampling strategy, while providing a convenient API out of the box. To achieve this, it is quite heavily templatized on various possible input types. There are many parts to this PR! Right now, I would like feedback on: - Your impression of the general usability of the API; - Your impression of which parts seem too complex or overthought; - The implementation of the parallelization aspects of the DataLoader. I've followed the Python implementation in some matters, but also differ in others. I think my implementation is a little cleaner and decouples components slightly better than the Python dataloader. I haven't added too many comments yet, as this is fresh out of the oven. Let me know if anything is unclear from the code itself. There also aren't any tests yet. I will write a comprehensive test suite once we agree on the API and implementation. apaszke ezyang pietern Pull Request resolved: https://github.com/pytorch/pytorch/pull/11918 Reviewed By: ezyang Differential Revision: D9998881 Pulled By: goldsborough fbshipit-source-id: 22cf357b63692bea42ddb1cc2abc71dae5030aea	2018-10-22 10:22:41 -07:00
JerryShih	0fa69c0276	Remove the protobuf library in pytorch linking list. (#12451 ) Summary: There will be a link error when the caffe2 doesn't use its protobuf under third_party. The pytorch will always link that protobuf. The pytorch doesn't use the protobuf directly. We could remove it from the list. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12451 Differential Revision: D10262676 Pulled By: ezyang fbshipit-source-id: c2ff3fdf757fc21ed689e7f663c082064b1a0bca	2018-10-18 18:31:51 -07:00
Benoit Steiner	bbe6ef3864	torch.finfo and torch.iinfo to mimic the numpy equivalent (#12472 ) Summary: This pull request intends to provide the functionality requested in https://github.com/pytorch/pytorch/issues/10742 by adding a new torch.finfo and torch.iinfo API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12472 Differential Revision: D10250829 Pulled By: benoitsteiner fbshipit-source-id: eb22ca55d5b0064bef381fa7f1eb75989977df30	2018-10-15 13:43:52 -07:00
Yangqing Jia	713e706618	Move exception to C10 (#12354 ) Summary: There is still some work to be done: - Move logging and unify AT_WARN with LOG(ERROR). - A few header files are still being plumbed through, need cleaning. - caffe2::EnforceNotMet aliasing is not done yet. - need to unify the macros. See c10/util/Exception.h This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches: (1) add //caffe2/c10:c10 to your dependency (or transitive dependency). (2) change objects such as at::Error, at::Optional to the c10 namespace. (3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes. Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354 Reviewed By: orionr Differential Revision: D10238910 Pulled By: Yangqing fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32	2018-10-15 13:33:18 -07:00
Philip Yang	b57fdf1db5	Properly set cmake python library and include_dirs (#12569 ) Summary: Properly set cmake python_library and include_dirs hints, so that systems with multiple versions of Python can still find the correct libraries and header files. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12569 Differential Revision: D10359910 Pulled By: soumith fbshipit-source-id: 2238dcbed7aac8a818c9435e6bba46cda5f81cad	2018-10-12 08:11:21 -07:00
Orion Reblitz-Richardson	25bd7fe488	Add USE_FFMPEG flag for setup.py and R2Plus1D (#12543 ) Summary: Needed for https://github.com/facebookresearch/R2Plus1D/pull/46 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12543 Differential Revision: D10320147 Pulled By: orionr fbshipit-source-id: a7dcbf7c0d4b405b9e89b28ef75a0ed1cf2a3e6a	2018-10-10 18:09:48 -07:00
Teng Li	c5d7494ca1	Use open-source NCCL2 in PyTorch (#12359 ) Summary: - Removed the old nccl file - Make open-source NCCL a submodule - CMake to make NCCL itself NCCL2 now is in the default build. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12359 Reviewed By: orionr, yns88 Differential Revision: D10219665 Pulled By: teng-li fbshipit-source-id: 134ff47057512ba617b48bf390c1c816fff3f881	2018-10-08 15:39:07 -07:00
Sam Gross	f9fb37ca79	Guard Denormals-Are-Zero with runtime CPU check (#12386 ) Summary: Previously, we were only enabling Flush-To-Zero (FTZ) and Denormals-Are-Zero (DAZ) when compiling with SSE3 enabled. After Christian's patch (https://github.com/pytorch/pytorch/pull/12109), we won't be compiling core files with SSE3 or SSE4 enabled, to better support older AMD processors. This moves the FTZ and DAZ code behind a runtime CPU check in preparation for that change. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12386 Differential Revision: D10222237 Pulled By: colesbury fbshipit-source-id: 7ffe32561ab965e1e5f9eb6e679602bbf4775538	2018-10-05 14:54:54 -07:00
Orion Reblitz-Richardson	895994a7c3	Back out "[pytorch][PR] [Build] Use open-source NCCL2 in PyTorch" fbshipit-source-id: a13075339d3a7b970e81be0b1a32a7c4c3a6c68d	2018-10-04 14:12:04 -07:00
Teng Li	ae7a7fb398	Use open-source NCCL2 in PyTorch (#12312 ) Summary: - Removed the old nccl file - Make open-source NCCL a submodule - CMake to make NCCL itself NCCL2 now is in the default build. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12312 Differential Revision: D10190845 Pulled By: teng-li fbshipit-source-id: 08d42253b774149a66919d194f88b34628c39bae	2018-10-04 11:42:17 -07:00
Sven-Hendrik Haase	080266e79c	Document CUDAHOSTCXX environment variable (#12265 ) Summary: This variable is already being used so this just serves to document that. I think it's an important variable, too, so it should definitely be documented there somewhere. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12265 Differential Revision: D10162261 Pulled By: soumith fbshipit-source-id: e0d01e012c2fedea63372de9967a8eaa3745fe94	2018-10-03 06:33:06 -07:00
daquexian	1fb8925efe	Fix typo LMBD->LMDB in docs of setup.py (#12282 ) Summary: `setup.py` reads `USE_LMDB` rather than `USE_LMBD` Pull Request resolved: https://github.com/pytorch/pytorch/pull/12282 Differential Revision: D10162025 Pulled By: soumith fbshipit-source-id: 6295a777be10509ca49516ad7c10061d26b6f9c9	2018-10-03 06:14:19 -07:00
Edward Yang	1619264ca5	Make ATen-core and caffe2 mutually recursive / merge template data<T>() (#11970 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11970 Adds an ATen-core-headers target, which caffe2_cpu_internal depends on, and makes ATen-core depend on caffe2_headers. If you link against ATen-core, you must ALSO link against caffe2_cpu_internal; if you link against caffe2_cpu_internal, you must ALSO link against ATen-core, otherwise you'll have undefined symbols. Then, we merge template data<T>() method with Caffe2 implementation, demonstrating that includes to Caffe2 (core) from ATen/core are working Reviewed By: jerryzh168 Differential Revision: D9967509 fbshipit-source-id: 3d220c38b2c3c646f8ff2884fdcc889fa9276c7a	2018-09-27 17:40:42 -07:00
Yangqing Jia	9c49bb9ddf	Move registry fully to c10 (#12077 ) Summary: This does 6 things: - add c10/util/Registry.h as the unified registry util - cleaned up some APIs such as export condition - fully remove aten/core/registry.h - fully remove caffe2/core/registry.h - remove a bogus aten/registry.h - unifying all macros - set up registry testing in c10 Also, an important note that we used to mark the templated Registry class as EXPORT - this should not happen, because one should almost never export a template class. This PR fixes that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12077 Reviewed By: ezyang Differential Revision: D10050771 Pulled By: Yangqing fbshipit-source-id: 417b249b49fed6a67956e7c6b6d22374bcee24cf	2018-09-27 03:09:54 -07:00
Orion Reblitz-Richardson	02d7c88fa4	Unify versions across setup.py, libtorch, and libcaffe2 (#12053 ) Summary: This unifies our versions across setup.py, libtorch, and libcaffe2. CMake has a default version (bumped to 1.0.0) that can be overridden by setup.py. The versions are also printed as a part of cmake/Summary.cmake to make sure they are correct. cc Yangqing ezyang soumith goldsborough pjh5 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12053 Differential Revision: D10041878 Pulled By: orionr fbshipit-source-id: a98a01771f6c008d1016ab63ab785c3a88c3ddb0	2018-09-26 08:55:06 -07:00
Peter Goldsborough	e05d689c49	Unify C++ API with C++ extensions (#11510 ) Summary: Currently the C++ API and C++ extensions are effectively two different, entirely orthogonal code paths. This PR unifies the C++ API with the C++ extension API by adding an element of Python binding support to the C++ API. This means the `torch/torch.h` included by C++ extensions, which currently routes to `torch/csrc/torch.h`, can now be rerouted to `torch/csrc/api/include/torch/torch.h` -- i.e. the main C++ API header. This header then includes Python binding support conditioned on a define (`TORCH_WITH_PYTHON_BINDINGS`), which is only passed when building a C++ extension. Currently stacked on top of https://github.com/pytorch/pytorch/pull/11498 Why is this useful? 1. One less codepath. In particular, there has been trouble again and again due to the two `torch/torch.h` header files and ambiguity when both ended up in the include path. This is now fixed. 2. I have found that it is quite common to want to bind a C++ API module back into Python. This could be for simple experimentation, or to have your training loop in Python but your models in C++. This PR makes this easier by adding pybind11 support to the C++ API. 3. The C++ extension API simply becomes richer by gaining access to the C++ API headers. soumith ezyang apaszke Pull Request resolved: https://github.com/pytorch/pytorch/pull/11510 Reviewed By: ezyang Differential Revision: D9998835 Pulled By: goldsborough fbshipit-source-id: 7a94b44a9d7e0377b7f1cfc99ba2060874d51535	2018-09-24 14:44:21 -07:00
Yangqing Jia	a6f1ae7f20	set up c10 scaffolding. Move macros proper first. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11939 Reviewed By: orionr, dzhulgakov Differential Revision: D10004629 Pulled By: Yangqing fbshipit-source-id: ba50a96820d35c7922d81c78c4cbe849c85c251c	2018-09-24 11:09:59 -07:00
Peter Goldsborough	6100c0ea14	Introduce ExtensionVersioner for C++ extensions (#11725 ) Summary: Python never closes shared libraries it `dlopen`s. This means that calling `load` or `load_inline` (i.e. building a JIT C++ extension) with the same C++ extension name twice in the same Python process will never re-load the library, even if the compiled source code and the underlying shared library have changed. The only way to circumvent this is to create a new library and load it under a new module name. I fix this, of course, by introducing a layer of indirection. Loading a JIT C++ extension now goes through an `ExtensionVersioner`, which hashes the contents of the source files as well as build flags, and if this hash changed, bumps an internal version stored for each module name. A bump in the version will result in the ninja file being edited and a new shared library and effectively a new C++ extension to be compiled. For this the version name is appended as `_v<version>` to the extension name for all versions greater than zero. One caveat is that if you were to update your code many times and always re-load it in the same process, you may end up with quite a lot of shared library objects in your extension's folder under `/tmp`. I imagine this isn't too bad, since extensions are typically small and there isn't really a good way for us to garbage collect old libraries, since we don't know what still has handles to them. Fixes https://github.com/pytorch/pytorch/issues/11398 CC ezyang gchanan soumith fmassa Pull Request resolved: https://github.com/pytorch/pytorch/pull/11725 Differential Revision: D9948244 Pulled By: goldsborough fbshipit-source-id: 695bbdc1f1597c5e4306a45cd8ba46f15c941383	2018-09-20 14:43:12 -07:00
Mingzhe Li	a7cbcb1bb9	Enable build_python on windows (#11385 ) Summary: The PR aims to resolve issues related to BUILD_PYTHON and BUILD_TEST after FULL_CAFFE2 is removed on Windows. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11385 Reviewed By: orionr Differential Revision: D9884906 Pulled By: mingzhe09088 fbshipit-source-id: fc114c0cbff6223f1ec261161e4caecc1fef5dd6	2018-09-17 21:40:03 -07:00
Bram Wasti	e8ecbcdf01	Move IValue to ATen/core (#11610 ) Summary: unblocks D9202320 Pull Request resolved: https://github.com/pytorch/pytorch/pull/11610 Differential Revision: D9774853 Pulled By: bwasti fbshipit-source-id: 4798223f6de680a7152283e8cad8814da7f90209	2018-09-17 18:25:50 -07:00
Soumith Chintala	73738ec570	bump version to 1.0 (#11717 ) Summary: I'm just doing the honors and bumping the version to 1.0.0. 1.0 preview and RC releases will have the 1.0.0.dev{date} tag Pull Request resolved: https://github.com/pytorch/pytorch/pull/11717 Reviewed By: SsnL Differential Revision: D9840857 Pulled By: soumith fbshipit-source-id: 4c9c2e01dccb3c521dab26c49e1569d970a87ace	2018-09-17 12:13:48 -07:00
Gregory Chanan	e125e61824	Fix flake8 Summary: Fix flake8 Reviewed By: ezyang Differential Revision: D9873872 fbshipit-source-id: 26e81238f22caaeccd2c8b4f39cedb6cfb5520dd	2018-09-17 11:10:29 -07:00
Jesse Hellemn	5bfd8f583c	Moving copy of Caffe2 protos back to build_pytorch_libs.sh (#11726 ) Summary: This way it shows up in all current and future setup.py commands, as otherwise we'd have to override every one to have them all call copy_protos. This is needed because the nightly packages still do not include caffe2_pb2, because setup.py bdist does not go through setup.py install or setup.py develop Pull Request resolved: https://github.com/pytorch/pytorch/pull/11726 Reviewed By: orionr Differential Revision: D9844075 Pulled By: pjh5 fbshipit-source-id: 57b469e48010aacd0c08c214ba8a7e5d757feefa	2018-09-17 08:58:05 -07:00
Soumith Chintala	acb6f18bab	fix generate_code.py caching (#11644 ) Summary: Currently, because of some setup.py logic, `ninja` caching of the `generate_code.py` build step was broken. This resulted in `generate_code.py` running every single time builds were happening, regardless of whether inputs changed. This updated logic fixes the input caching Pull Request resolved: https://github.com/pytorch/pytorch/pull/11644 Reviewed By: orionr Differential Revision: D9814348 Pulled By: soumith fbshipit-source-id: 2012960908d0f600488d410094095cfd72adc34f	2018-09-13 12:39:48 -07:00
Teng Li	6dcdbd3a1d	Make C10d support CPU only build (#11513 ) Summary: This makes torch.distributed work for CPU-only builds. Also added one more CI test case to cover the MPI CPU build. All CI tests should cover this change Pull Request resolved: https://github.com/pytorch/pytorch/pull/11513 Differential Revision: D9784546 Pulled By: teng-li fbshipit-source-id: 0976a6b0fd199670926f0273e17ad7d2805e42e7	2018-09-11 22:10:34 -07:00
Zachary DeVito	289a8c9b7d	Allow train/eval, and non-Tensor arguments to python functions (#11505 ) Summary: This whitelists train/eval functions in script modules, and tests that nested nn.Modules still work. This also changes the code for calling python functions from script to allow non-tensor inputs/outputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11505 Differential Revision: D9765466 Pulled By: zdevito fbshipit-source-id: 1177bff931324422b69e18fa0bbaa82e3c98ec69	2018-09-11 15:05:09 -07:00
Orion Reblitz-Richardson	d32b41003a	Copy protos on install same as develop (#11517 ) Summary: This is a potential fix for https://github.com/pytorch/pytorch/issues/11453 and https://github.com/pytorch/pytorch/issues/11074 worked through with pjh5 . Turns out we had some protos copy code that was in the .sh file that was removed. Better to have it in setup.py, though, same as for develop. cc ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/11517 Differential Revision: D9771911 Pulled By: orionr fbshipit-source-id: 76975d8f71f38d951eaaed0b50dd3ec36dd177a9	2018-09-11 10:09:56 -07:00
Soumith Chintala	4e8d9a4a58	Introducing python setup.py rebuild develop (#11487 ) Summary: This speeds up incremental builds by doing the following changes: - Uses `rsync` instead of `cp` (when `rsync` is found) which is a bit smarter in doing "maybe copy" - Introduces a `rebuild` mode which does not rerun `cmake` in `build_pytorch_libs.sh`. Note: `rebuild` should only be used if you don't add / remove files to the build, as `cmake` is not rerun Current no-op rebuild speedup: - 1m 15s -> 20s There are some lingering bugs. No-op rebuilds rerun `cmake` for two rebuilds (likely that cmake logic is dependent on the install folder, hence kicking off rebuild). So what you see is ``` python setup.py rebuild develop # first time - ~5 mins python setup.py rebuild develop # second time - ~3 mins python setup.py rebuild develop # third time - ~2 mins python setup.py rebuild develop # fourth time - ~20 seconds python setup.py rebuild develop # fifth time - ~20 seconds ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/11487 Differential Revision: D9769087 Pulled By: soumith fbshipit-source-id: 20fbecde33af6426149c13767e8734fb3be783c5	2018-09-11 08:56:25 -07:00
Orion Reblitz-Richardson	a175282776	Flags for LMDB, LevelDB, and Caffe2 ops (#11462 ) Summary: Add flags for LMDB and LevelDB, default `OFF`. These can be enabled with ``` USE_LMDB=1 USE_LEVELDB=1 python setup.py build_deps ``` Also add a flag to build Caffe2 ops, which is default `ON`. Disable with ``` NO_CAFFE2_OPS=1 python setup.py build_deps ``` cc Yangqing soumith pjh5 mingzhe09088 Pull Request resolved: https://github.com/pytorch/pytorch/pull/11462 Reviewed By: soumith Differential Revision: D9758156 Pulled By: orionr fbshipit-source-id: 95fd206d72fdf44df54fc5d0aeab598bff900c63	2018-09-10 17:27:50 -07:00
Peter Goldsborough	a0d4106c07	Integrate custom op tests with CI (#10611 ) Summary: This PR is stacked on https://github.com/pytorch/pytorch/pull/10610, and only adds changes in one file `.jenkins/pytorch/test.sh`, where we now build the custom op tests and run them. I'd also like to take this PR to discuss whether the [`TorchConfig.cmake`](https://github.com/pytorch/pytorch/blob/master/cmake/TorchConfig.cmake.in) I made is robust enough (we will also see in the CI) orionr Yangqing dzhulgakov what do you think? Also ezyang for CI changes Pull Request resolved: https://github.com/pytorch/pytorch/pull/10611 Differential Revision: D9597627 Pulled By: goldsborough fbshipit-source-id: f5af8164c076894f448cef7e5b356a6b3159f8b3	2018-09-10 15:40:21 -07:00
Orion Reblitz-Richardson	802d21c8f4	Remove FULL_CAFFE2 flag (#11321 ) Summary: Continuing pjh5's work to remove FULL_CAFFE2 flag completely. With these changes you'll be able to also do something like ``` NO_TEST=1 python setup.py build_deps ``` and this will skip building tests in caffe2, aten, and c10d. By default the tests are built. cc mingzhe09088 Yangqing Pull Request resolved: https://github.com/pytorch/pytorch/pull/11321 Reviewed By: mingzhe09088 Differential Revision: D9694950 Pulled By: orionr fbshipit-source-id: ff5c4937a23d1a263378a196a5eda0cba98af0a8	2018-09-07 15:09:44 -07:00
Peter Goldsborough	01930a3145	Move sync_params to C++ (#9805 ) Summary: The next function I'm moving to C++ is `sync_params`. It is stacked on top of https://github.com/pytorch/pytorch/pull/9729, so some changes will go away when it lands and I rebase. I also split code into a `.h` and `.cpp` file for better code organization. pietern apaszke Pull Request resolved: https://github.com/pytorch/pytorch/pull/9805 Differential Revision: D9688604 Pulled By: goldsborough fbshipit-source-id: 4467104d3f9e2354425503b9e4edbd59603e20a8	2018-09-07 12:56:40 -07:00
iotamudelta	9de2085806	Use custom hcc/HIP, purge hcSPARSE (#11198 ) Summary: * purge hcSPARSE now that rocSPARSE is available * integrate a custom hcc and HIP * hcc brings two important compiler fixes (fixes hundreds of unit tests) * HIP brings a smart dispatcher that allows us to avoid a lot of static_casts (we haven't yet removed the automatic static_casts but this catches some occurrences the script did not catch) * mark 5 unit tests skipping that have regressed w/ the new hcc (we don't know yet what is at fault) * optimize bitonic sort - the comparator is always an empty struct - therefore passing it by value saves at least 3 bytes. It also removes an ambiguity around passing references to `__global__` functions Pull Request resolved: https://github.com/pytorch/pytorch/pull/11198 Differential Revision: D9652340 Pulled By: ezyang fbshipit-source-id: f5af1d891189da820e3d13b7bed91a7a43154690	2018-09-06 19:38:07 -07:00
Orion Reblitz-Richardson	dda8402447	Cleanup dependency of distributed flags (#11221 ) Summary: Now that we're building everything together, make all distributed flags conditional on USE_DISTRIBUTED being set. cc pietern cpuhrsch Pull Request resolved: https://github.com/pytorch/pytorch/pull/11221 Reviewed By: Yangqing Differential Revision: D9664267 Pulled By: orionr fbshipit-source-id: a296cda5746ad150028c97160f8beacba955ff73	2018-09-06 08:56:00 -07:00
Jesse Hellemn	c0efe6f027	Forward declarations of needed curand functions (#10911 ) Summary: Needed for FULL_CAFFE2=1 with statically linked CUDA libraries. Waiting on advice from Nvidia Pull Request resolved: https://github.com/pytorch/pytorch/pull/10911 Reviewed By: pjh5 Differential Revision: D9636256 Pulled By: orionr fbshipit-source-id: fcad7945910b6c8fb5f52e81cc87dad5fcfb3c65	2018-09-05 16:56:26 -07:00
Richard Zou	68c2e014cb	Handling for py2/py3 division differences (#11016 ) Summary: - In Python 2, use of `/` (regardless of int/float/Tensor) causes a compiler error if `from __future__ import division` is not imported in the file. - The / operator is universally set to do "true" division for integers - Added a `prim::FloorDiv` operator because it is used in loop unrolling. The error if users use '/' in python 2 without importing from __future__ occurs when building the JIT AST. cc apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11016 Differential Revision: D9613527 Pulled By: zou3519 fbshipit-source-id: 0cebf44d5b8c92e203167733692ad33c4ec9dac6	2018-09-05 14:57:38 -07:00
Teng Li	020501b7b0	Getting rid of USE_C10D for build (#11237 ) Summary: Will use USE_DISTRIBUTED for both c10d and THD Pull Request resolved: https://github.com/pytorch/pytorch/pull/11237 Differential Revision: D9647825 Pulled By: teng-li fbshipit-source-id: 06e0ec9b5e2f8f38780fc88718f8499463e9e969	2018-09-04 17:27:53 -07:00
iotamudelta	33c7cc13ca	improve docker packages, fix bugs, enable tests, enable FFT (#10893 ) Summary: * improve docker packages (install OpenBLAS to have at-compile-time LAPACK functionality w/ optimizations for both Intel and AMD CPUs) * integrate rocFFT (i.e., enable Fourier functionality) * fix bugs in ROCm caused by wrong warp size * enable more test sets, skip the tests that don't work on ROCm yet * don't disable asserts any longer in hipification * small improvements Pull Request resolved: https://github.com/pytorch/pytorch/pull/10893 Differential Revision: D9615053 Pulled By: ezyang fbshipit-source-id: 864b4d27bf089421f7dfd8065e5017f9ea2f7b3b	2018-09-02 08:54:42 -07:00
Teng Li	3791bd12c8	PT1 Release Milestone No.2 MPI Group Support with all tests passed (#11128 ) Summary: Added MPI group support. This makes all previous MPI group test cases pass. Also, release the MPI thread level support by serializing different PG's MPI ops. This is required. The build is fixed too Pull Request resolved: https://github.com/pytorch/pytorch/pull/11128 Differential Revision: D9602188 Pulled By: teng-li fbshipit-source-id: 1d618925ae5fb7b47259b23051cc181535aa7497	2018-08-31 12:39:56 -07:00
Edward Yang	cd9416317d	Minor copy-edit on setup.py Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10933 Reviewed By: cpuhrsch Differential Revision: D9526650 fbshipit-source-id: 8ad1c989bee7009b3f95a2641189f55cf6c1979f	2018-08-29 13:41:04 -07:00
Orion Reblitz-Richardson	3c9775fff8	Remove nanopb since we've switched to protobuf (#10772 ) Summary: We no longer use nanopb in PyTorch (or Caffe2) so removing. All protobuf manipulation should go through standard protobuf, which is statically linked inside libcaffe2.so by default. cc zdevito pjh5 ezyang Yangqing Pull Request resolved: https://github.com/pytorch/pytorch/pull/10772 Reviewed By: pjh5 Differential Revision: D9465894 Pulled By: orionr fbshipit-source-id: 8cdf9f1d3953b7a48478d381814d7107df447201	2018-08-24 10:54:38 -07:00
Orion Reblitz-Richardson	8c13971f57	Remove protobuf require and use requirements.txt (#10771 ) Summary: In prep for making FULL_CAFFE2 default, users shouldn't be required to have protobuf installed. cc pjh5 Pull Request resolved: https://github.com/pytorch/pytorch/pull/10771 Reviewed By: pjh5 Differential Revision: D9474458 Pulled By: orionr fbshipit-source-id: 3e28f5ce64d125a0a0418ce083f9ec73aec62492	2018-08-24 10:39:40 -07:00
Johannes M Dieterich	a4c59a9dab	MIOpen integration, more tests enabled, bug fixes (#10612 ) Summary: * first integration of MIOpen for batch norm and conv on ROCm * workaround a ROCm compiler bug exposed by elementwise_kernel through explicit capture of variables in the densest packing * workaround a ROCm compiler bug exposed by having `extern "C" __host__` as a definition and just `__host__` in the implementation through the hipify script * use fabs() in accordance with C++11 for double absolute, not ::abs() which is integer-only on ROCm * enable test_sparse set on CI, skip tests that don't work currently on ROCm * enable more tests in test_optim after the elementwise_bug got fixed * enable more tests in test_dataloader * improvements to hipification and ROCm build With this, resnet18 on CIFAR data trains without hang or crash in our tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10612 Reviewed By: bddppq Differential Revision: D9423872 Pulled By: ezyang fbshipit-source-id: 22c0c985217d65c593f35762b3eb16969ad96bdd	2018-08-23 15:24:47 -07:00
Edward Yang	227635142f	Delete THD master_worker (#10731 ) Summary: Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/10731 Differential Revision: D9423675 Pulled By: ezyang fbshipit-source-id: 37221e11d84cc3672b944af598ea229a1d4c38cc	2018-08-22 08:54:36 -07:00
Peter Goldsborough	c101a57a74	Build mechanism for custom operators (#10226 ) Summary: This is the last step in the custom operator implementation: providing a way to build from C++ and Python. For this I: 1. Created a `FindTorch.cmake` taken largely from ebetica with a CMake function to easily create simple custom op libraries, 2. Created a `torch/op.h` header for easy inclusion of necessary headers, 3. Created a test directory `pytorch/test/custom_operator` which includes the basic setup for a custom op. 1. It defines an op in `op.{h,cpp}` 2. Registers it with the JIT using `RegisterOperators` 3. Builds it into a shared library via a `CMakeLists.txt` 4. Binds it into Python using a `setup.py`. This step makes use of our C++ extension setup that we already have. No work, yey! The pure C++ and the Python builds are separate and not coupled in any way. zdevito soumith dzhulgakov Pull Request resolved: https://github.com/pytorch/pytorch/pull/10226 Differential Revision: D9296839 Pulled By: goldsborough fbshipit-source-id: 32f74cafb6e3d86cada8dfca8136d0dfb1f197a0	2018-08-16 18:56:17 -07:00
Anders Papitto	130881f0e3	Delete build_caffe2.sh, replace with build_libtorch.py (#10508 ) Summary: delete build_caffe2.sh, replace with build_libtorch.py as suggested by peter (and copy-pasted from his draft PR). This ensures that all consumers of the torch CMake file go through as unified a path as possible. In order to change the surrounding infrastructure as little as possible, I made some tweaks to enable build_pytorch_libs.sh to generate the test binaries relative to the current directory, rather than hardcoding to pytorch/build. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10508 Differential Revision: D9354398 Pulled By: anderspapitto fbshipit-source-id: 05b03df087935f88fca7ccefc676af477ad2d1e9	2018-08-16 08:10:04 -07:00
Orion Reblitz-Richardson	021b4888db	Remove setup_requires and tests_require from setup.py for FULL_CAFFE2 (#10530 ) Summary: In my environment, it looks like setup.py hangs when running ``` FULL_CAFFE2=1 python setup.py build_deps ``` Removing this fixes things, but we might also want to look at `tests_require`, which came over from `setup_caffe2.py`. cc pjh5 Pull Request resolved: https://github.com/pytorch/pytorch/pull/10530 Differential Revision: D9349597 Pulled By: orionr fbshipit-source-id: 589145eca507dfaf16386884ee2fbe60299660b4	2018-08-15 14:26:53 -07:00
Anders Papitto	d1442b36f3	add a rebuild_libtorch command for speedier iteration. (#10036 ) Summary: It just calls into `ninja install`. For iterative work on libtorch.so/_C.so, `python setup.py rebuild_libtorch develop` should provide quick iteration Pull Request resolved: https://github.com/pytorch/pytorch/pull/10036 Differential Revision: D9317869 Pulled By: anderspapitto fbshipit-source-id: 45ea45a1b445821add2fb9d823a724fc319ebdd2	2018-08-14 12:10:02 -07:00
iotamudelta	75651d5b58	improve use of ROCm libraries, enable more tests, small fixes (#10406 ) Summary: * some small leftovers from the last PR review * enable more unit test sets for CI * replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND) * use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2 * use strided_batched gemm interface also from the batched internal interface * re-enable Dropout.cu as we now have philox w/ rocRAND Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406 Reviewed By: Jorghi12 Differential Revision: D9277093 Pulled By: ezyang fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2	2018-08-13 11:39:43 -07:00
Jesse Hellemn	cd81217f8e	A single print statement in setup.py Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10473 Reviewed By: ml7 Differential Revision: D9299196 Pulled By: pjh5 fbshipit-source-id: f9aa84c2859df12f9da9ac5205e1918c253e19fb	2018-08-13 11:39:42 -07:00
Sam Gross	0b63d12db6	Don't call into Python during Storage destruction. (#10407 ) Summary: ``` This removes PyObjectFinalizer. We were seeing SIGSEGV at exit in some programs that use multiprocessing. The backtrace pointed to StorageRef.__del__ being called from subtype_dealloc. My guess is that the Python interpreter was shutdown before all C++ Storage objects were deallocated. Deallocating the C++ Storage called the finalizer which called back into Python after it was no longer safe to do so. This avoids a callback from C++ into Python during Storage finalization. Instead, dead Storage objects (expired weak references) are collected periodically when shared_cache exceeds a limit. The limit is scaled with 2x the number of live references, which places an upper bound on the amount of extra memory held by dead Storage objects. In practice, this should be very small. ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/10407 Differential Revision: D9272400 Pulled By: colesbury fbshipit-source-id: ecb14d9c6d54ffc91e134c34a4e770a4d09048a2	2018-08-13 11:20:07 -07:00
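The collection strategy described in the commit above (sweep expired weak references only when the cache grows past a limit, then rescale the limit to twice the number of live references) can be sketched in plain Python. This is an illustration under assumptions, not the code from the PR: the names `SharedCacheSketch`, `set_ref`, `free_dead_references`, and the `min_limit` parameter are invented for the example, and ordinary Python objects stand in for Storage objects.

```python
import weakref


class Payload:
    """Hypothetical stand-in for an object held only weakly by the cache."""


class SharedCacheSketch:
    def __init__(self, min_limit=128):
        # min_limit is an assumed floor so tiny caches are not swept constantly.
        self.min_limit = min_limit
        self.limit = min_limit
        self._refs = {}

    def __len__(self):
        return len(self._refs)

    def __contains__(self, key):
        return key in self._refs

    def set_ref(self, key, obj):
        # Store only a weak reference; no finalizer calls back into the cache.
        self._refs[key] = weakref.ref(obj)
        if len(self._refs) > self.limit:
            self.free_dead_references()

    def free_dead_references(self):
        # Drop entries whose referent has died and count the survivors.
        live = 0
        for key, ref in list(self._refs.items()):
            if ref() is None:
                del self._refs[key]
            else:
                live += 1
        # Scale the limit with 2x the live references, which bounds how many
        # dead entries can accumulate before the next sweep.
        self.limit = max(self.min_limit, live * 2)
```

The trade-off in this sketch is the one the commit message names: dead entries linger until the next sweep, but their number is bounded by the limit, and nothing needs to run at object destruction time.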
Jesse Hellemn	def3715e82	Minor changes for nicer pip packages (#9544 ) Summary: I am using this to test a CI job to upload pip packages, and so am using the Caffe2 namespace to avoid affecting the existing pytorch packages. Pull Request resolved: https://github.com/pytorch/pytorch/pull/9544 Reviewed By: orionr Differential Revision: D9267111 Pulled By: pjh5 fbshipit-source-id: a68162ed29d2eb9ce353d8435ccb5f16c3b0b894	2018-08-10 12:09:46 -07:00
Yangqing Jia	40109b16d0	Remove caffe1 specific proto (#10380 ) Summary: This was used as a convenient way for us to convert c1 models. Now that conversion is more or less done, we should probably require any users who need to convert c1 models to explicitly install c1. This PR removes the explicit c1 proto (which was copied from c1) in favor of explicit installation. Note that caffe_translator would still work properly, only difference is that now users need to install c1 separately. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10380 Differential Revision: D9267981 Pulled By: Yangqing fbshipit-source-id: a6ce5d9463e6567976da83f2d08b2c3d94d14390	2018-08-10 11:10:26 -07:00
peter	506142ac8a	Add warning for building PyTorch using Python 2.7 on Windows (#10247 ) Summary: Fixes #9232. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10247 Differential Revision: D9178257 Pulled By: SsnL fbshipit-source-id: cc553335a5a918b6d77fe1064460cb66114859ca	2018-08-05 21:24:02 -07:00
Shuichi KITAGUCHI	df23bdc82d	add BEGIN NOT-CLEAN-FILES marker to .gitignore. (#10233 ) Summary: Using Visual Studio Code and Visual Studio, these IDEs store configurations to `FOLDER/.vscode` and `FOLDER/.vs`. But "setup.py clean" deletes these folders because those are described in `.gitignore` file. To prevent this, add "BEGIN NOT-CLEAN-FILES" marker to `.gitignore` file and "setup.py clean" ignores lines after this marker. Discussed in #10206 Pull Request resolved: https://github.com/pytorch/pytorch/pull/10233 Differential Revision: D9175515 Pulled By: ezyang fbshipit-source-id: 24074a7e6e505a3d51382dc5ade5c65c97deda37	2018-08-05 15:55:44 -07:00
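The marker handling described in the commit above can be illustrated with a short sketch. It assumes the marker text appears somewhere on a line of `.gitignore` (for example inside a comment) and that every non-comment, non-blank line before it is a pattern to clean. The function name `patterns_to_clean` is hypothetical and is not taken from `setup.py`.

```python
MARKER = "BEGIN NOT-CLEAN-FILES"


def patterns_to_clean(gitignore_text):
    """Return the ignore patterns that appear before the marker line."""
    patterns = []
    for line in gitignore_text.splitlines():
        stripped = line.strip()
        # Check for the marker first, since it may live inside a comment line.
        if MARKER in stripped:
            break
        # Skip blank lines and ordinary comments.
        if not stripped or stripped.startswith("#"):
            continue
        patterns.append(stripped)
    return patterns
```

With a `.gitignore` that lists `build/` above the marker and `.vscode` below it, this sketch would return only `build/`, so a clean step driven by it would leave the IDE folders alone.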
Elias Ellison	170d29769b	Strings lexing, parsing, implementation in print (#9324 ) Summary: This PR adds strings to the ast and implements them for print statements. Strings are lifted as attributes to the print node. They must be arguments to print itself, not as an argument for an object that is passed to print. If they are encountered elsewhere a NYI exception will be thrown. Pull Request resolved: https://github.com/pytorch/pytorch/pull/9324 Reviewed By: jramseyer Differential Revision: D8807128 Pulled By: eellison fbshipit-source-id: 984401ff458ed18d473c6d1bd86750e56c77d078	2018-08-02 11:09:03 -07:00
Gregory Chanan	2d56b5cf8b	Prepare THC for first class scalars (0-dimensional tensors). Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10072 Differential Revision: D9082421 Pulled By: gchanan fbshipit-source-id: d4327b07aaef85cc2521393008154ebceae8cbfd	2018-08-01 14:28:51 -07:00
Edward Yang	37a226de63	When BUILD_ATEN=OFF, use ATen/core directly (#10019 ) Summary: ATenCore.h is a dummy header to just test that this is working at all. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10019 Reviewed By: smessmer Differential Revision: D9067262 Pulled By: ezyang fbshipit-source-id: 58bab9c0aa83b56335e36b719b9b6505400d8dee	2018-07-30 21:09:55 -07:00
Edward Yang	a08119afc2	Eliminate direct access to size/strides of THTensor; replace them with std::vector (#9561 ) Summary: * THTensor now stores `sizes_` and `strides_` which is a `std::vector<int64_t>` * Anywhere a "public" API function made use of a int64_t* of sizes, I opted to just finagle it out of the tensor using THTensor_getSizePtr rather than try to rewrite all of these sites to use ArrayRef. They should use ArrayRef eventually, but not yet. * There are new utility functions for resizing sizes/strides in one go (THTensor_resizeDim), or replacing sizes and strides with completely new values (THTensor_setSizesAndStrides) * Anywhere you said `t->size[n] = 0`, we now say `THTensor_setSizeAt(t, n, 0)`, ditto for strides * Anywhere you said `t->size[n]`, we now say `t->size(n)` (coming soon: ditto for strides) Previous review of just the `std::vector` change in #9518, but I'm planning to merge this all in one go. Note for gchanan: review from commit "ci" and after Pull Request resolved: https://github.com/pytorch/pytorch/pull/9561 Reviewed By: cpuhrsch Differential Revision: D8901926 Pulled By: ezyang fbshipit-source-id: 483cf275060ab0a13845cba1ece39dd127142510	2018-07-19 14:10:06 -07:00
Anders Papitto	4c615b1796	Introduce libtorch to setup.py build (#8792 ) Summary: Prior to this diff, there have been two ways of compiling the bulk of the torch codebase. There was no interaction between them - you had to pick one or the other. 1) with setup.py. This method - used the setuptools C extension functionality - worked on all platforms - did not build test_jit/test_api binaries - did not include the C++ api - always included python functionality - produced _C.so 2) with cpp_build. This method - used CMake - did not support Windows or ROCM - was capable of building the test binaries - included the C++ api - did not build the python functionality - produced libtorch.so This diff combines the two. 1) cpp_build/CMakeLists.txt has become torch/CMakeLists.txt. This build - is CMake-based - works on all platforms - builds the test binaries - includes the C++ api - does not include the python functionality - produces libtorch.so 2) the setup.py build - compiles the python functionality - calls into the CMake build to build libtorch.so - produces _C.so, which has a dependency on libtorch.so In terms of code changes, this mostly means extending the cmake build to support the full variety of environments and platforms. There are also a small number of changes related to the fact that there are now two shared objects - in particular, windows requires annotating some symbols with dllimport/dllexport, and doesn't allow exposing thread_local globals directly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/8792 Reviewed By: ezyang Differential Revision: D8764181 Pulled By: anderspapitto fbshipit-source-id: abec43834f739049da25f4583a0794b38eb0a94f	2018-07-18 14:59:33 -07:00
Chunli Fu	a487b08c2e	AutoBatching - IR transformation(basic operators) (#9198 ) Summary: Use decorator `torch.jit.batch` to implement auto-batching (call `to_batch` pass to do IR transformation). - `to_batch` pass: "to_batch.h/cpp" in csrc/jit/passes to transform a graph to a new batched graph. - Write several basic operators for BatchTensor (add, mul, sigmoid, tanh, mm, matmul, select). - Register the operators in a lookup table `<std::string, std::shared_ptr<Graph>>`. (use the Graph to replace the original node in IR graph) Move BatchTensor in python from torch.BatchTensor to torch.jit.BatchTensor Pull Request resolved: https://github.com/pytorch/pytorch/pull/9198 Reviewed By: zdevito Differential Revision: D8744466 Pulled By: ChunliF fbshipit-source-id: 9ea56a30f55cb870f13a2069a47cc635419763ff	2018-07-11 18:25:07 -07:00
Adam Paszke	b9f575fc33	Remove legacy code from the JIT (#9323 ) Summary: In particular, get rid of backward tracing and CppOp. Pull Request resolved: https://github.com/pytorch/pytorch/pull/9323 Reviewed By: ezyang Differential Revision: D8795935 Pulled By: apaszke fbshipit-source-id: fb7a7eeee41902da35f2a8efd77262ca60fd6bbe	2018-07-11 10:25:38 -07:00
Zachary DeVito	efefd1d7cf	Unify aten_dispatch and aten_schema into a single operator abstraction with human-readable schema. (#8885 ) Summary: This is a series of two commits that should probably be read separately. They are stacked on top of #9018 since the second commit requires it for correctness. Commit 1 ======= This commit is the first in a series that will clean up how we handle declaring operators and intrinsics in the JIT to make it more modular and readable. This introduces readable declarations that can be used to register operators and switches gen_jit_dispatch to generate this schema. A follow up PR will remove the dispatch keys like "add-3" and resolve ops directly based on the registered schema, further simplifying the generation process. * Switches schema over to parsed declarations, in the future this will allow something like: ``` registry.register_intrinsic("foo(Tensor a, Tensor b) -> Tensor", [](Stack& stack) { ... }) ``` This will allow the scalable registration of intrinsics for lists, tuples, and other ops, as long as meta-data for these ops (e.g. derivatives and size propagation routines). The declarations resemble those used by PythonArgParser but have been significantly cleaned up to minimize the number of types that can appear in the declaration. We should strive to get the other parts of PyTorch switched over to this restricted declaration set when possible, but it is too much to do in a single PR. My hope is that eventually we will use a very similar language to describe declarations in C10, and this can serve as a guide for that. Parsing is done using the script lexer, so it is very robust to whitespace and extensible for future types. This removes the other way we encoded schema, and makes it easier to see what schema are registered. 
Current generated declarations: https://gist.github.com/zdevito/a96a17766fb3a098d69a91ee00abaaf6 * Switches how we handle attempting to use an integer in the place of a fixed-sized int list, such as in conv (e.g. 'int[3] stride=1'). Now that we can statically distinguish between int and Tensor, we handle the expansion as an implicit conversion in the compiler. This allows us to simplify the interpreter since it no longer needs to handle the conversion itself. * Schema declarations have been changed so that they match the type system in the IR exactly. In particular, attribute_info which was used by liftConstantAttributes has been dropped and constant attributes are lifted purely based on the type of the input. Type conversions in compiler have been simplified due to this change. * Error highlighting in ErrorReport now only reports at most 20 lines of code, to make reading where an error occurred easier. Commit 2 ======= This commit unifies aten_dispatch and aten_schema into a single Operator object that both contains schema and implementation information. In the future we can use this object to also contain functionality like shape prop and autodiff needed by all operators. Operators are registered globally, and dispatch logic uses the schema information to figure out which variant to use. Descriptor keys, a frequent source of inscrutable debug errors, have been removed. * Introduce Operator, to replace TensorOp. Unlike TensorOp, we use Operator for all op implementations, including primitives that may occur in the graphs. The only exceptions are ops that are only known to the interpreter like jumps, and GraphExecutors where we need to record additional debug info. * Adds a global registry for Operator implementations. aten_dispatch.cpp turns into register_aten_ops.cpp, which registers all the Operators for aten with the operator registry. register_prim_ops.cpp now contains the implementations for primitive operators that used to be in the interpreter. 
This means that it is now safe to use `getOperation(node)` to lookup the true interpreter function for the node, which will simplify const-propagation passes. * Remove addInterpreterOpHandler in favor of global operator registry. * Instead of descriptors, we match Node arguments directly against FunctionSchema describing expected inputs in `matchSchema`. `matchSchema` knows how to parse both attributes and positional inputs from a node and match it to the appropriate registered operator. Debug error messages when we try to run an invalid operator are significantly improved: they now automatically display the schemas for the ops with the same name that are registered. * Merge aten_schema into register_aten_ops. Each Operator takes a string schema which is parsed to determine when to dispatch to that op. * Cleans up gen_jit_dispatch.py now that we do not need to write out descriptors. In particular, skip_scalar_overloads can be removed since Richard's code sorts declarations to put Tensor, Tensor declarations first. * remove matchSchemaAndLiftConstantAttributes and use emitBuiltinCall instead to remove code duplication * refactor stack manipulation functions into a separate header file. Pull Request resolved: https://github.com/pytorch/pytorch/pull/8885 Reviewed By: jamesr66a Differential Revision: D8751048 Pulled By: zdevito fbshipit-source-id: 312aabfbf88307c5f6ab947b6caf691468b94557	2018-07-10 10:24:48 -07:00
Edward Yang	d0d1820814	Add weak pointer and finalizer support directly to THStorage. (#9148 ) Summary: The underlying use-case is the file descriptor to storage cache in torch.multiprocessing.reductions. Previously, this was implemented by wrapping an existing allocator with a "weak ref" allocator which also knew to null out the weak reference when the storage died. This is terribly oblique, and prevents us from refactoring the allocators to get rid of per-storage allocator state. So instead of going through this fiasco, we instead directly implement weak pointers and finalizers in THStorage. Weak pointers to THStorage retain the THStorage struct, but not the data_ptr. When all strong references die, data_ptr dies and the finalizers get invoked. There is one major hazard in this patch, which is what happens if you repeatedly call _weak_ref on a storage. For cleanliness, we no longer shove our grubby fingers into the finalizer struct to see if there is already a Python object for the weak reference and return it; we just create a new one (no one is checking these Python objects for identity). This means if you keep calling it, we'll keep piling on finalizers. That's bad! But I am not going to fix it until it is actually a problem for someone, because then we need to add another caching layer. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/9148 Differential Revision: D8729106 Pulled By: ezyang fbshipit-source-id: 69710ca3b7c7e05069090e1b263f8b6b9f1cf72f	2018-07-10 06:25:33 -07:00
Peter Goldsborough	4498fb962b	Add space around operator (#9294 ) Summary: Fixes lint failure on master Pull Request resolved: https://github.com/pytorch/pytorch/pull/9294 Differential Revision: D8779010 Pulled By: goldsborough fbshipit-source-id: da1ea2604189fd704c22fa8a5770bd92845cea91	2018-07-09 20:24:21 -07:00
Jesse Hellemn	99ab082366	Making setup.py install work for Caffe2 (#8509 ) Summary: Tested on my mac on a pretty clean anaconda3 Pull Request resolved: https://github.com/pytorch/pytorch/pull/8509 Reviewed By: orionr Differential Revision: D8702257 Pulled By: pjh5 fbshipit-source-id: eda03ef9732da9fc56b31d909af5c0e39520d689	2018-07-09 18:10:58 -07:00
Zachary DeVito	819815d9c0	Fix missing compile_commands.json for aten (#9227 ) Summary: When we moved the libaten build into libcaffe2, we changed the location where it generated compile_commands.json such that it was no longer being picked up by the build script. This fixes it so it is still found. Pull Request resolved: https://github.com/pytorch/pytorch/pull/9227 Reviewed By: goldsborough Differential Revision: D8757984 Pulled By: zdevito fbshipit-source-id: 73df26bf08d98f18ac841d6c0db7e332fd328ab6	2018-07-08 16:54:34 -07:00
Francisco Massa	f6027bb15d	Install hpp headers for CPP Extensions (#9182 ) Summary: With the Cppzation of a few files in `TH`/`THC`, the CPP extensions got broken whenever the user uses a feature from `THC` in their files, when pytorch is installed via `python setup.py install`. This addresses issues such as ``` /home/me/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/THC/THCDeviceTensorUtils.cuh:5:25: fatal error: THCTensor.hpp: No such file or directory ``` Closes https://github.com/pytorch/pytorch/pull/9182 Reviewed By: soumith Differential Revision: D8734581 Pulled By: fmassa fbshipit-source-id: 2a1138f208592eaccb01fcdb805a6b369d7a497a	2018-07-05 07:55:25 -07:00
Roy Li	c61f0217a5	combine size_average and reduce args in loss functions (#8018 ) Summary: closes #7929 Closes https://github.com/pytorch/pytorch/pull/8018 Differential Revision: D8682540 Pulled By: li-roy fbshipit-source-id: 649170dd1a7f373151c1d4e949838bd1c5651936	2018-07-01 05:39:00 -07:00
Chunli Fu	67b21117b7	Add BatchTensor class (#8922 ) Summary: Add BatchTensor class - construct from data, mask, dims or construct from list of tensors - can return a list of tensors from a BatchTensor class next step: do IR level transformation and operators Closes https://github.com/pytorch/pytorch/pull/8922 Differential Revision: D8668986 Pulled By: ChunliF fbshipit-source-id: 8b24d2a9f46a3b42dbb397e99e9e059dfb2b326e	2018-06-29 15:57:27 -07:00
Zachary DeVito	f74207c99f	Allow autograd to work even when the shape of values cannot be determined (#8641 ) This commit implements the solution proposed in https://github.com/pytorch/pytorch/issues/8410 to work around the need to create zero tensors with the same shape as inputs. It introduces the concept of a LinearBlock which marks places in the code where we know if all the inputs to the node are zero, then the outputs to the node are also zero. Autodiff introduces LinearBlocks around backwards functions, which have this property. specializeUndef then propagates Undef nodes using this information. Notes: * Since we do not always specialize, we have a pass LowerLinearBlocks that replaces the block with an if statement that dynamically guards the Undef case. * We introduce AutogradAdd which is addition that still works when its inputs might be undefined. In cases where we specialize this will get removed in favor of a normal add, but there are cases where gradient graphs do not specialize (e.g. when they are not differentiable, but a derivative is required) so it is important for this op to be executable.	2018-06-25 18:40:04 -07:00
Orion Reblitz-Richardson	5a7b4840d9	Move nanopb-generated ONNX to unique file name (#8773 ) * Move nanopb-generated ONNX to unique file name * fix other places	2018-06-22 09:51:56 -04:00
Richard Zou	8489c4cc6e	Better support for literals in jit script (#8687 ) Addresses #8177 A design doc can be found here: [gist](https://gist.github.com/zou3519/4b7f13f03cc9f3612bd9363e6405fa0a) version or [quip](https://fb.quip.com/azL1AqUckBdo) version General approach: - Add NumberType, FloatType, IntType to represent Python numbers, floats and ints. - Emit these types for python literals - Change aten_schema such that Scalars are NumberType, int64_t and bool are IntType. - Emit aten::type_as, prim::NumToTensor, and prim::TensorToNum nodes for tensor-number math. (see examples below) - Erase NumberType, prim::NumToTensor, and prim::TensorToNum for ONNX export ### Tensor/number math ``` import torch @torch.jit.script def fn(x): return x + 1 ``` ``` graph(%x : Dynamic) { %1 : int = prim::Constant[value={1}]() %2 : Dynamic = prim::NumToTensor(%1) %3 : Dynamic = aten::type_as(%2, %x) %4 : Dynamic = aten::add[alpha={1}](%x, %4) return (%5); } ``` ### Number/Number Math ``` import torch @torch.jit.script def fn(zero): c = 1 + 1 return zero + c ``` ``` graph(%zero : Dynamic) { %1 : int = prim::Constant[value={1}]() %2 : int = prim::Constant[value={1}]() %3 : Dynamic = prim::num_to_tensor(%1) %4 : Dynamic = prim::num_to_tensor(%2) %5 : Dynamic = aten::add[alpha={1}](%3, %4) %c : int = prim::TensorToNum(%6) # this is the result of the addition ... return (%13); } ``` List of squashed commits: * Introduce Python Number types Added: IntType, FloatType, NumberType with IntType <: NumberType FloatType <: NumberType Changed aten_schema so arguments have corresponding types * Emit a NumberType for python literals. Also emit a NumberType for Scalar default values. * Add prim::NumToTensor and prim::TensorToNum * Add DynamicType -> NumberType implicit cast for bc * Better ensureTensor error message * Add ensureTensorOrNumber. Allow passing Number to some functions Like the range() construct and slices * Patch IntList to work. 
IntList is still a DynamicType in the frontend: a tensor gets built from a List[int]. Also, IntList[1] is a "union between int and IntList" the way it is implemented. If the frontend sees an int being passed for an IntList[1] arg, it converts it to a tensor as well. * Enforce some order on schemas to avoid overload ambiguity add(Tensor, Tensor) should appear earlier than add(Tensor, Scalar). This matches the order in which python_arg_parser parses its arguments. * Disable std_dim and var_dim tests. With the new schema information, std(input, keepdim) and std(input, dim) are ambiguous. This will need to be fixed at a later date. * Add NumberType erasure pass. This is used for ONNX export and to ensure that NumberType information doesn't reach the interpreter * Add support for mixed tensor/number math ops. * Tests for new functionality. Includes: - Tensor/number math - number/number math - EraseNumberTypes pass test * Patch tests Update expect tests for: - decompose_addmm - loop unrolling tests Because python numbers are now NumberType, they cannot be returned by functions anymore. Work around this by using "torch.full", or by adding a tensor([0]) (taken from FIXME_zerol()). Both approaches are used because torch.full is more readable, but it is broken in some cases. * Add erase_number_types to torch/CMakeLists.txt * Move math back to emitSimpleExpr from emitSugaredExpr * Remove some dead lines * Re-enable some excluded script/trace tests that are fixed. * Move some tests to expected failure * Address some comments (more addressing to come) * Erase relevant aten::type_as nodes in EraseNumberTypes I also changed it so that EraseNumberTypes is only called for ONNX export. It is no longer used to prevent prim::NumToTensor/prim::TensorToNum from reaching shape_analysis or interpreter.cpp. shape_analysis infers the type of the output of these nodes to be the same as their input. interpreter.cpp treats both of these nodes as no-ops. 
* Add reminder to fix std/var * Call EraseNumberTypes only when exporting a script module * Update expects after rebase	2018-06-21 15:43:38 -04:00
anderspapitto	48e90e3339	Build system changes (#8627 ) * All changes needed to get rid of process_github.sh * allow thnn_h_path	2018-06-20 17:45:26 -04:00
Teng Li	61c96811be	[c10d] NCCL python binding and CI test, with bug fixes (#8357 ) * [c10d] NCCL python binding and CI test, with bug fixes * Addressed comments and further bug fix * Made NCCL build optional, made C10D libc10d.a only * Fixed tests so that NCCL pg won't run when not needed * Addressed comments	2018-06-19 13:02:39 -07:00
cpuhrsch	05c473b85c	Temporarily remove TBB (#8255 )	2018-06-18 19:31:57 -04:00
Peter Goldsborough	372d1d6735	Create ATen tensors via TensorOptions (#7869 ) * Created TensorOptions Storing the type in TensorOptions to solve the Variable problem Created convenience creation functions for TensorOptions and added tests Converted zeros to TensorOptions Converted rand to TensorOptions Fix codegen for TensorOptions and multiple arguments Put TensorOptions convenience functions into torch namespace too All factory functions except _like support TensorOptions Integrated with recent JIT changes Support _like functions Fix in place modification Some cleanups and fixes Support sparse_coo_tensor Fix bug in Type.cpp Fix .empty calls in C++ API Fix bug in Type.cpp Trying to fix device placement Make AutoGPU CPU compatible Remove some auto_gpu.h uses Fixing some headers Fix some remaining CUDA/AutoGPU issues Fix some AutoGPU uses Fixes to dispatch_tensor_conversion Reset version of new variables to zero Implemented parsing device strings Random fixes to tests Self review cleanups flake8 Undo changes to variable.{h,cpp} because they fail on gcc7.2 Add [cuda] tag to tensor_options_cuda.cpp Move AutoGPU::set_index_from into .cpp file because Windows is stupid and sucks Fix linker error in AutoGPU.cpp Fix bad merge conflict in native_functions.yaml Fixed caffe2/contrib/aten Fix new window functions added to TensorFactories.cpp * Removed torch::TensorOptions Added code to generate wrapper functions for factory methods Add implicit constructor from Backend to TensorOptions Remove Var() from C++ API and use torch:: functions Use torch:: functions more subtly in C++ API Make AutoGPU::set_device more exception safe Check status directly in DynamicCUDAHooksInterface Rename AutoGPU to DeviceGuard Removed set_requires_grad from python_variables.h and warn appropriately in Variable::set_requires_grad remove python_default_init: self.type() Add back original factory functions, but with deprecation warnings Disable DeviceGuard for a couple functions in ATen Remove print 
statement Fix DeviceGuard construction from undefined tensor Fixing CUDA device compiler issues Moved as many methods as possible into header files Dont generate python functions for deprecated factories Remove merge conflict artefact Fix tensor_options_cuda.cpp Fix set_requires_grad not being checked Fix tensor_new.h TEMPORARILY put some methods in .cpp files to see if it solves issues on windows and mac Fix bug in DeviceGuard.h Missing includes TEMPORARILY moving a few more methods into .cpp to see if it fixes windows Fixing linker errors * Fix up SummaryOps to use new factories Undo device agnostic behavior of DeviceGuard Use -1 instead of optional for default device index Also move DeviceGuard methods into header Fixes around device index after optional -> int32_t switch Fix use of DeviceGuard in new_with_tensor_copy Fix tensor_options.cpp * Fix Type::copy( * Remove test_non_float_params from ONNX tests * Set requires_grad=False in ONNX tests that use ints * Put layout/dtype/device on Tensor * Post merge fixes * Change behavior of DeviceGuard to match AutoGPU * Fix C++ API integration tests * Fix flip functions	2018-06-16 00:40:35 -07:00
Tongzhou Wang	c537fd7432	fix lint (#8567 )	2018-06-15 17:34:39 -04:00
Soumith Chintala	dc186cc9fe	Remove NO_* and WITH_* across codebase, except in setup.py (#8555 ) * remove legacy options from CMakeLists * codemod WITH_ to USE_ for WITH_CUDA, WITH_CUDNN, WITH_DISTRIBUTED, WITH_DISTRIBUTED_MW, WITH_GLOO_IBVERBS, WITH_NCCL, WITH_ROCM, WITH_NUMPY * cover SYSTEM_NCCL, MKLDNN, NNPACK, C10D, NINJA * removed NO_* variables and hotpatch them only in setup.py * fix lint	2018-06-15 12:29:48 -04:00

862 Commits