pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Can Balioglu	65e6194aeb	Introduce the torchrun entrypoint (#64049 ) Summary: This PR introduces a new `torchrun` entrypoint that simply "points" to `python -m torch.distributed.run`. It is shorter and less error-prone to type and gives a nicer syntax than a rather cryptic `python -m ...` command line. Along with the new entrypoint the documentation is also updated and places where `torch.distributed.run` are mentioned are replaced with `torchrun`. cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski SciPioneer H-Huang mrzzd cbalioglu gcramer23 Pull Request resolved: https://github.com/pytorch/pytorch/pull/64049 Reviewed By: cbalioglu Differential Revision: D30584041 Pulled By: kiukchung fbshipit-source-id: d99db3b5d12e7bf9676bab70e680d4b88031ae2d	2021-08-26 20:17:48 -07:00
Peter Bell	560cd88195	Kill THCUNN (#63429 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63429 Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D30441308 Pulled By: ngimel fbshipit-source-id: 3ae342a2f8d5c7f8827b637c4055c5d1b0a1be26	2021-08-23 12:07:16 -07:00
Nikita Shulga	6e5d065b2b	Add pocketfft as submodule (#62841 ) Summary: Using https://github.com/mreineck/pocketfft Also delete explicit installation of pocketfft during the build as it will be available via submodule Limit PocketFFT support to cmake-3.10 or newer, as `set_source_files_properties` does not seem to work as expected with cmake-3.5 Partially addresses https://github.com/pytorch/pytorch/issues/62821 Pull Request resolved: https://github.com/pytorch/pytorch/pull/62841 Reviewed By: seemethere Differential Revision: D30140441 Pulled By: malfet fbshipit-source-id: d1a1cf1b43375321f5ec5b3d0b538f58082f7825	2021-08-17 15:29:56 -07:00
Shen Li	1022443168	Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: revert-hammer Differential Revision: D30279364 (`b004307252`) Original commit changeset: c1ed77dfe43a fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e	2021-08-12 11:45:01 -07:00
Zsolt Dollenstein	b004307252	[codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: manual inspection & sandcastle Reviewed By: zertosh Differential Revision: D30279364 fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a	2021-08-12 10:58:35 -07:00
Can Balioglu	7565039ee9	Support system-provided Intel TBB (#61934 ) Summary: This PR: (1) enables the use of a system-provided Intel TBB for building PyTorch, (2) removes `tbb:task_scheduler_init` references since it has been removed from TBB a while ago (3) marks the implementation of `_internal_set_num_threads` with a TODO as it requires a revision that fixes its thread allocation logic. Tested with `test/run_test`; no new tests are introduced since there are no behavioral changes (removal of `tbb::task_scheduler_init` has no impact on the runtime behavior). Pull Request resolved: https://github.com/pytorch/pytorch/pull/61934 Reviewed By: malfet Differential Revision: D29805416 Pulled By: cbalioglu fbshipit-source-id: 22042b428b57b8fede9dfcc83878d679a19561dd	2021-08-02 07:39:00 -07:00
imaginary-person	9e53c823b8	Add AVX512 support in ATen & remove AVX support (#61903 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61903 ### Remaining Tasks - [ ] Collate results of benchmarks on two Intel Xeon machines (with & without CUDA, to check if CPU throttling causes issues with GPUs) - make graphs, including Roofline model plots (Intel Advisor can't make them with libgomp, though, but with Intel OpenMP). ### Summary 1. This draft PR produces binaries with 3 types of ATen kernels - default, AVX2, AVX512. Using the environment variable `ATEN_AVX512_256=TRUE` also results in 3 types of kernels, but the compiler can use 32 ymm registers for AVX2, instead of the default 16. ATen kernels for `CPU_CAPABILITY_AVX` have been removed. 2. `nansum` is not using AVX512 kernel right now, as it has poorer accuracy for Float16, than does AVX2 or DEFAULT, whose respective accuracies aren't very good either (#59415). It was more convenient to disable AVX512 dispatch for all dtypes of `nansum` for now. 3. On Windows, ATen Quantized AVX512 kernels are not being used, as quantization tests are flaky. If `--continue-through-failure` is used, then `test_compare_model_outputs_functional_static` fails. But if this test is skipped, `test_compare_model_outputs_conv_static` fails. If both these tests are skipped, then a third one fails. These are hard to debug right now due to not having access to a Windows machine with AVX512 support, so it was more convenient to disable AVX512 dispatch of all ATen Quantized kernels on Windows for now. 4. One test is currently being skipped - `test_lstm` in `quantization.bc` (https://github.com/pytorch/pytorch/issues/59098) - It fails only on Cascade Lake machines, irrespective of the `ATEN_CPU_CAPABILITY` used, because FBGEMM uses `AVX512_VNNI` on machines that support it. The value of `reduce_range` should be used as `False` on such machines.
The list of the changes is at https://gist.github.com/imaginary-person/4b4fda660534f0493bf9573d511a878d. Credits to ezyang for proposing `AVX512_256` - these use AVX2 intrinsics but benefit from 32 registers, instead of the 16 ymm registers that AVX2 uses. Credits to limo1996 for the initial proposal, and for optimizing `hsub_pd` & `hadd_pd`, which didn't have direct AVX512 equivalents, and are being used in some kernels. He also refactored `vec/functional.h` to remove duplicated code. Credits to quickwritereader for helping fix 4 failing complex multiplication & division tests. ### Testing 1. `vec_test_all_types` was modified to test basic AVX512 support, as tests already existed for AVX2. Only one test had to be modified, as it was hardcoded for AVX2. 2. `pytorch_linux_bionic_py3_8_gcc9_coverage_test1` & `pytorch_linux_bionic_py3_8_gcc9_coverage_test2` are now using `linux.2xlarge` instances, as they support AVX512. They were used for testing AVX512 kernels, as AVX512 kernels are being used by default in both of the CI checks. Windows CI checks had already been using machines with AVX512 support. ### Would the downclocking caused by AVX512 pose an issue? I think it's important to note that AVX2 causes downclocking as well, and the additional downclocking caused by AVX512 may not hamper performance on some Skylake machines & beyond, because of the double vector-size. I think that [this post with verifiable references is a must-read](https://community.intel.com/t5/Software-Tuning-Performance/Unexpected-power-vs-cores-profile-for-MKL-kernels-on-modern-Xeon/m-p/1133869/highlight/true#M6450). Also, AVX512 would _probably not_ hurt performance on a high-end machine, [but measurements are recommended](https://lemire.me/blog/2018/09/07/avx-512-when-and-how-to-use-these-new-instructions/). In case it does, `ATEN_AVX512_256=TRUE` can be used for building PyTorch, as AVX2 can then use 32 ymm registers instead of the default 16. 
[FBGEMM uses `AVX512_256` only on Xeon D processors](https://github.com/pytorch/FBGEMM/pull/209), which are said to have poor AVX512 performance. This [official data](https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-scalable-spec-update.pdf) is for the Intel Skylake family, and the first link helps understand its significance. Cascade Lake & Ice Lake SP Xeon processors are said to be even better when it comes to AVX512 performance. Here is the corresponding data for [Cascade Lake](https://cdrdv2.intel.com/v1/dl/getContent/338848) - ![CASCADE LAKE AVX2](https://user-images.githubusercontent.com/76181208/120666172-ffec3f80-c451-11eb-8ea1-8933ccc12a1b.PNG) ![CASCADE LAKE AVX512](https://user-images.githubusercontent.com/76181208/120666190-04b0f380-c452-11eb-9faa-38d233c874c8.PNG) The corresponding data isn't publicly available for Intel Xeon SP 3rd gen (Ice Lake SP), but [Intel mentioned that the 3rd gen has frequency improvements pertaining to AVX512](https://newsroom.intel.com/wp-content/uploads/sites/11/2021/04/3rd-Gen-Intel-Xeon-Scalable-Platform-Press-Presentation-281884.pdf). Ice Lake SP machines also have 48 KB L1D caches, so that's another reason for AVX512 performance to be better on them. ### Is PyTorch always faster with AVX512? No, but then PyTorch is not always faster with AVX2 either. Please refer to #60202. The benefit from vectorization is apparent with small tensors that fit in caches or in kernels that are more compute heavy. For instance, AVX512 or AVX2 would yield no benefit for adding two 64 MB tensors, but adding two 1 MB tensors would do well with AVX2, and even more so with AVX512. It seems that memory-bound computations, such as adding two 64 MB tensors, can be slow with vectorization (depending upon the number of threads used), as the effects of downclocking can then be observed.
Original pull request: https://github.com/pytorch/pytorch/pull/56992 Reviewed By: soulitzer Differential Revision: D29266289 Pulled By: ezyang fbshipit-source-id: 2d5e8d1c2307252f22423bbc14f136c67c3e6184	2021-07-22 08:51:49 -07:00
zhouzhuojie	6107cf3750	Add --jobs 0 for git submodule update (#61311 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61311 Pull Request resolved: https://github.com/pytorch/pytorch/pull/61152 Some related docs about `submodule.fetchJobs` https://git-scm.com/docs/git-config#Documentation/git-config.txt-submodulefetchJobs ``` time git submodule update --init --recursive ________________________________________________________ Executed in 243.20 secs fish external usr time 49.64 secs 213.00 micros 49.64 secs sys time 29.27 secs 795.00 micros 29.27 secs ``` ``` time git submodule update --init --recursive --jobs 4 ________________________________________________________ Executed in 143.04 secs fish external usr time 51.06 secs 246.00 micros 51.06 secs sys time 30.96 secs 742.00 micros 30.96 secs ``` ``` time git submodule update --init --recursive --jobs 8 ________________________________________________________ Executed in 124.64 secs fish external usr time 51.76 secs 264.00 micros 51.76 secs sys time 30.49 secs 739.00 micros 30.49 secs ``` ``` time git submodule update --init --recursive --jobs 0 # use all online cpus ________________________________________________________ Executed in 129.75 secs fish external usr time 51.64 secs 181.00 micros 51.64 secs sys time 31.49 secs 781.00 micros 31.49 secs ``` Test Plan: Imported from OSS Reviewed By: 1ntEgr8 Differential Revision: D29560875 Pulled By: zhouzhuojie fbshipit-source-id: 556027dffe744c66428075a8a1bf64683930aaaf	2021-07-07 16:28:18 -07:00
Nathan John Sircombe	bf00d26deb	Enables builds with Compute Library backend for oneDNN (#55913 ) Summary: Since v1.7, oneDNN (MKL-DNN) has supported the use of Compute Library for the Arm architecture to provide optimised convolution primitives on AArch64. This change enables the use of Compute Library in the PyTorch build. Following the approach used to enable the use of CBLAS in MKLDNN, it is enabled by setting the env vars USE_MKLDNN and USE_MKLDNN_ACL. The location of the Compute Library build must be set using `ACL_ROOT_DIR`. This is an extension of the work in https://github.com/pytorch/pytorch/pull/50400 which added support for the oneDNN/MKL-DNN backend on AArch64. _Note: this assumes that Compute Library has been built and installed at ACL_ROOT_DIR. Compute library can be downloaded here: `https://github.com/ARM-software/ComputeLibrary`_ Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/55913 Reviewed By: ailzhang Differential Revision: D28559516 Pulled By: malfet fbshipit-source-id: 29d24996097d0a54efc9ab754fb3f0bded290005	2021-05-20 07:43:56 -07:00
Winston Smith	47c566ebb1	Rename namespace `vec256` to `vec`, struct `Vec256` to `Vectorized` (and other related classes/structs) (#58438 ) Summary: In order to make it more convenient for maintainers to review the ATen AVX512 implementation, the namespace `vec256` is being renamed to `vec` in this PR, as modifying 77 files & creating 2 new files only took a few minutes, as these changes aren't significant, so fewer files would've to be reviewed while reviewing https://github.com/pytorch/pytorch/issues/56992. The struct `Vec256` is not being renamed to `Vec`, but `Vectorized` instead, because there are some `using Vec=` statements in the codebase, so renaming it to `Vectorized` was more convenient. However, I can still rename it to `Vec`, if required. ### Changes made in this PR - Created `aten/src/ATen/cpu/vec` with subdirectory `vec256` (vec512 would be added via https://github.com/pytorch/pytorch/issues/56992). The changes were made in this manner - 1. First, a script was run to rename `vec256` to `vec` & `Vec256` to `Vectorized` - ``` # Ref: https://stackoverflow.com/a/20721292 cd aten/src grep -rli 'vec256\/vec256\.h' * | xargs -i@ sed -i 's/vec256\/vec256\.h/vec\/vec\.h/g' @ grep -rli 'vec256\/functional\.h' * | xargs -i@ sed -i 's/vec256\/functional\.h/vec\/functional\.h/g' @ grep -rli 'vec256\/intrinsics\.h' * | xargs -i@ sed -i 's/vec256\/intrinsics\.h/vec\/vec256\/intrinsics\.h/g' @ grep -rli 'namespace vec256' * | xargs -i@ sed -i 's/namespace vec256/namespace vec/g' @ grep -rli 'Vec256' * | xargs -i@ sed -i 's/Vec256/Vectorized/g' @ grep -rli 'vec256\:\:' * | xargs -i@ sed -i 's/vec256\:\:/vec\:\:/g' @ grep -rli 'at\:\:vec256' * | xargs -i@ sed -i 's/at\:\:vec256/at\:\:vec/g' @ cd ATen/cpu mkdir vec mv vec256 vec cd vec/vec256 grep -rli 'cpu\/vec256\/' * | xargs -i@ sed -i 's/cpu\/vec256\//cpu\/vec\/vec256\//g' @ grep -rli 'vec\/vec\.h' * | xargs -i@ sed -i 's/vec\/vec\.h/vec\/vec256\.h/g' @ ``` 2. 
`vec256` & `VEC256` were replaced with `vec` & `VEC` respectively in 4 CMake files. 3. In `pytorch_vec/aten/src/ATen/test/`, `vec256_test_all_types.h` & `vec256_test_all_types.cpp` were renamed. 4. `pytorch_vec/aten/src/ATen/cpu/vec/vec.h` & `pytorch_vec/aten/src/ATen/cpu/vec/functional.h` were created. Both currently have one line each & would have 5 when AVX512 support would be added for ATen. Pull Request resolved: https://github.com/pytorch/pytorch/pull/58438 Reviewed By: malfet Differential Revision: D28509615 Pulled By: ezyang fbshipit-source-id: 63840df5f23b3b59e203d25816e2977c6a901780	2021-05-19 16:04:36 -07:00
Xiang Gao	6c70cbedb6	step 0 of cuDNN v8 convolution API integration (#51390 ) Summary: This PR is step 0 of adding PyTorch convolution bindings using the cuDNN frontend. The cuDNN frontend is the recommended way of using cuDNN v8 API. It is supposed to have faster release cycles, so that, for example, if people find a specific kernel has a bug, they can report it, and that kernel will be blocked in the cuDNN frontend and frameworks could just update that submodule without the need for waiting for a whole cuDNN release. The work is not complete, and this PR is only step 0. What this PR does: - Add cudnn-frontend as a submodule. - Modify cmake to build that submodule. - Add bindings for convolution forward in `Conv_v8.cpp`, which is disabled by a macro by default. - Tested manually by enabling the macro and run `test_nn.py`. All tests pass except those mentioned below. What this PR doesn't: - Only convolution forward, no backward. The backward will use v7 API. - No 64bit-indexing support for some configuration. This is a known issue of cuDNN, and will be fixed in a later cuDNN version. PyTorch will not implement any workaround for issue, but instead, v8 API should be disabled on problematic cuDNN versions. - No test beyond PyTorch's unit tests. - Not tested for correctness on real models. - Not benchmarked for performance. - Benchmark cache is not thread-safe. (This is marked as `FIXME` in the code, and will be fixed in a follow-up PR) - cuDNN benchmark is not supported. - There are failing tests, which will be resolved later: ``` FAILED test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_cudnn_nhwc_cuda_float16 - AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0.001 and atol=1e-05, found 32 element(s) (out of 32) whose difference(s) exceeded the margin of error (in... 
FAILED test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_cudnn_nhwc_cuda_float32 - AssertionError: False is not true : Tensors failed to compare as equal!With rtol=1.3e-06 and atol=1e-05, found 32 element(s) (out of 32) whose difference(s) exceeded the margin of error (... FAILED test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_large_cuda - RuntimeError: CUDNN_BACKEND_OPERATION: cudnnFinalize Failed cudnn_status: 9 FAILED test/test_nn.py::TestNN::test_Conv2d_depthwise_naive_groups_cuda - AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0 and atol=1e-05, found 64 element(s) (out of 64) whose difference(s) exceeded the margin of error (including 0 an... FAILED test/test_nn.py::TestNN::test_Conv2d_deterministic_cudnn - RuntimeError: not supported yet FAILED test/test_nn.py::TestNN::test_ConvTranspose2d_groups_cuda_fp32 - RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM FAILED test/test_nn.py::TestNN::test_ConvTranspose2d_groups_cuda_tf32 - RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM ``` Although this is not a complete implementation of cuDNN v8 API binding, I still want to merge this first. This would allow me to do small and incremental work, for the ease of development and review. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51390 Reviewed By: malfet Differential Revision: D28513167 Pulled By: ngimel fbshipit-source-id: 9cc20c9dec5bbbcb1f94ac9e0f59b10c34f62740	2021-05-19 12:54:09 -07:00
davidriazati@fb.com	c44cbc63cc	Ignore more compiler warnings, unify WERROR options (#56630 ) Summary: This adds some more compiler warning ignores for everything that happens on a standard CPU build (CUDA builds still have a bunch of warnings so we can't turn on `-Werror` everywhere yet). Pull Request resolved: https://github.com/pytorch/pytorch/pull/56630 Pulled By: driazati Reviewed By: malfet Differential Revision: D28005063 fbshipit-source-id: 541ed415eb0470ddf7e08c22c5eb6da9db26e9a0	2021-04-29 21:20:29 -07:00
davidriazati@fb.com	4b96fc060b	Remove distutils (#57040 ) Summary: [distutils](https://docs.python.org/3/library/distutils.html) is on its way out and will be deprecated-on-import for Python 3.10+ and removed in Python 3.12 (see [PEP 632](https://www.python.org/dev/peps/pep-0632/)). There's no reason for us to keep it around since all the functionality we want from it can be found in `setuptools` / `sysconfig`. `setuptools` includes a copy of most of `distutils` (which is fine to use according to the PEP), that it uses under the hood, so this PR also uses that in some places. Fixes #56527 Pull Request resolved: https://github.com/pytorch/pytorch/pull/57040 Pulled By: driazati Reviewed By: nikithamalgifb Differential Revision: D28051356 fbshipit-source-id: 1ca312219032540e755593e50da0c9e23c62d720	2021-04-29 12:10:11 -07:00
David Reiss	89377e3e45	model_dump tool for model inspection (#56868 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56868 See __init__.py for a summary of the tool. The following sections are present in this initial version - Model Size. Show the total model size, as well as a breakdown by stored files, compressed files, and zip overhead. (I expect this breakdown to be a bit more useful once data.pkl is compressed.) - Model Structure. This is basically the output of `show_pickle(data.pkl)`, but as a hierarchical structure. Some structures cause this view to crash right now, but it can be improved incrementally. - Zip Contents. This is basically the output of `zipinfo -l`. - Code. This is the TorchScript code. It's integrated with a blame window at the bottom, so you can click "Blame Code", then click a bit of code to see where it came from (based on the debug_pkl). This currently doesn't render properly if debug_pkl is missing or incomplete. - Extra files (JSON). JSON dumps of each json file under /extra/, up to a size limit. - Extra Pickles. For each .pkl file in the model, we safely unpickle it with `show_pickle`, then render it with `pprint` and include it here if the size is not too large. We aren't able to install the pprint hack that the show_pickle CLI uses, so we get one-line rendering for custom objects, which is not very useful. Built-in types look fine, though. In particular, bytecode.pkl seems to look fine (and we hard-code that file to ignore the size limit). I'm checking in the JS dependencies to avoid a network dependency at runtime. They were retrieved from the following URLs, then passed through a JS minifier: https://unpkg.com/htm@3.0.4/dist/htm.module.js?module https://unpkg.com/preact@10.5.13/dist/preact.module.js?module Test Plan: Manually ran on a few models I had lying around. Mostly tested in Chrome, but I also poked around in Firefox. 
Reviewed By: dhruvbird Differential Revision: D28020849 Pulled By: dreiss fbshipit-source-id: 421c30ed7ca55244e9fda1a03b8aab830466536d	2021-04-28 07:33:10 -07:00
Bert Maher	90f848572c	NNC depthwise conv2d implementation (#54920 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54920 Add a depthwise convolution implementation and reasonably good schedules for 3x3 stride=1,2. ghstack-source-id: 126076113 Test Plan: new tensorexpr test: Conv.DepthwiseConv2D Reviewed By: ZolotukhinM Differential Revision: D27413745 fbshipit-source-id: 833da6072b655fbe2b679704e9d56a08e1bf7e7e	2021-04-08 21:56:53 -07:00
Nikita Shulga	14a2501786	Update max-version in setup.py to 3.9 (#54690 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54690 Reviewed By: seemethere Differential Revision: D27330462 Pulled By: malfet fbshipit-source-id: db332acf5aa5bff67af2bef777935f2387bc963c	2021-03-26 12:45:03 -07:00
Nikita Shulga	e8e570e9c5	[MacOS] Cross compile stub when building for M1 on x86 (#54046 ) Summary: Also rename `CROSS_COMPILE_ARM` to `CROSS_COMPILE_ARM64` Pull Request resolved: https://github.com/pytorch/pytorch/pull/54046 Reviewed By: walterddr Differential Revision: D27071928 Pulled By: malfet fbshipit-source-id: 9143cd5d110ed67f0609f0a4bbb20922012ee665	2021-03-16 00:24:09 -07:00
James Butterworth	37ab711822	Adding learning rate schedulers to C++ API (#52268 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/50577 Learning rate schedulers had not yet been implemented for the C++ API. This pull request introduces the learning rate scheduler base class and the StepLR subclass. Furthermore, it modifies the existing OptimizerOptions such that the learning rate scheduler can modify the learning rate. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52268 Reviewed By: mrshenli Differential Revision: D26818387 Pulled By: glaringlee fbshipit-source-id: 2b28024a8ea7081947c77374d6d643fdaa7174c1	2021-03-10 23:09:51 -08:00
Nikita Shulga	7e6a84d238	Add logic to auto-fetch submodules (#53461 ) Summary: In setup.py add logic to: - Get list of submodules from .gitmodules file - Auto-fetch submodules if none of them has been fetched In CI: - Test this on non-docker capable OSes (Windows and Mac) - Use shallow submodule checkouts whenever possible Pull Request resolved: https://github.com/pytorch/pytorch/pull/53461 Reviewed By: ezyang Differential Revision: D26871119 Pulled By: malfet fbshipit-source-id: 8b23d6a4fcf04446eac11446e0113819476ef6ea	2021-03-09 09:13:35 -08:00
Andrew Millspaugh	1fc8831322	Add missing tensor header (#53489 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53489 It appears that D26675801 (`1fe6a6507e`) broke Glow builds (and probably other installs) with the inclusion of the python_arg_parser include. That dep lives in a directory of its own and was not included in the setup.py. Test Plan: OSS tests should catch this. Reviewed By: ngimel Differential Revision: D26878180 fbshipit-source-id: 70981340226a9681bb9d5420db56abba75e7f0a5	2021-03-08 12:05:17 -08:00
Rong Rong (AI Infra)	f58f7b786c	add distributed backend options in setup.py (#53214 ) Summary: Currently there's only one indicator for build_ext regarding distributed backend `USE_DISTRIBUTED`. However one can build with selective backends. adding the 3 distributed backend option in setup.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/53214 Test Plan: Set the 3 options in environment and locally ran `python setup.py build_ext` Reviewed By: janeyx99 Differential Revision: D26818259 Pulled By: walterddr fbshipit-source-id: 688e8f83383d10ce23ee1f019be33557ce5cce07	2021-03-05 14:39:36 -08:00
Nikita Shulga	272dfc7bb9	Add MANIFEST.in (#52908 ) Summary: Do not build PyTorch if `setup.py` is called with 'sdist' option Regenerate bundled license while sdist package is being built Refactor `check_submodules` out of `build_deps` and check that submodules project are present during source package build stage. Test that sdist package is configurable during `asan-build` step Fixes https://github.com/pytorch/pytorch/issues/52843 Pull Request resolved: https://github.com/pytorch/pytorch/pull/52908 Reviewed By: walterddr Differential Revision: D26685176 Pulled By: malfet fbshipit-source-id: 972a40ae36e194c0b4e0fc31c5e1af1e7a815185	2021-03-01 18:28:25 -08:00
Nikita Shulga	a0a1bb074b	Make NumPy dependency dynamic (#52794 ) Summary: Move NumPy initialization from `initModule()` to a singleton inside the `torch::utils::is_numpy_available()` function. This singleton will print a warning that NumPy integration is not available, rather than failing to import torch altogether. The warning will be printed only once, and will look something like the following: ``` UserWarning: Failed to initialize NumPy: No module named 'numpy.core' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:66.) ``` This is helpful if PyTorch was compiled with the wrong NumPy version, or if NumPy is not commonly available on the platform (which is often the case on AARCH64 or Apple M1). Test that PyTorch is usable after numpy is uninstalled at the end of `_test1` CI config. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52794 Reviewed By: seemethere Differential Revision: D26650509 Pulled By: malfet fbshipit-source-id: a2d98769ef873862c3704be4afda075d76d3ad06	2021-02-25 19:45:00 -08:00
mattip	9cbefad83f	concatenate LICENSE files when building a wheel (#51634 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/50695 I checked locally that the concatenated license file appears at `torch-<version>.dist-info/LICENSE` in the wheel. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51634 Reviewed By: zhangguanheng66 Differential Revision: D26225550 Pulled By: walterddr fbshipit-source-id: 830c59fb7aea0eb50b99e295edddad9edab6ba3a	2021-02-08 08:28:46 -08:00
Ilia Cherniavskii	e34992ebee	Set USE_KINETO=1 (#49897 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49897 Resend of https://github.com/pytorch/pytorch/pull/49201 Test Plan: see 49201 Reviewed By: malfet Differential Revision: D25717102 Pulled By: ilia-cher fbshipit-source-id: 5e794a7f5fe160ca64ac9d190c4fd3e8f1e443e6	2021-01-22 00:09:21 -08:00
Richard Barnes	a5339b9d7c	Drop unused imports from leftovers (#49953 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49953 From ``` ./python/libcst/libcst codemod remove_unused_imports.RemoveUnusedImportsWithGlean --no-format caffe2/ ``` Test Plan: Standard sandcastle tests Reviewed By: xush6528 Differential Revision: D25727348 fbshipit-source-id: b3feef80b9b4b535f1bd4060dace5b1a50bd5e69	2021-01-04 16:31:48 -08:00
Protonu Basu	4c5a4dbb8c	[Tensorexpr]Copying header files in tensorexpr dir (#49933 ) Summary: Previously header files from jit/tensorexpr were not copied; this PR should enable copying. This will allow other OSS projects like Glow to use TE. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49933 Reviewed By: Krovatkin, mruberry Differential Revision: D25725927 Pulled By: protonu fbshipit-source-id: 9d5a0586e9b73111230cacf044cd7e8f5c600ce9	2020-12-29 15:18:52 -08:00
Ilia Cherniavskii	72b00a8a52	Revert D25480770: Set USE_KINETO=1 Test Plan: revert-hammer Differential Revision: D25480770 (`1a92802bde`) Original commit changeset: 037cd774f554 fbshipit-source-id: 6a6062195033ca91fcc0cfa1e890e47efc774ac1	2020-12-18 07:06:28 -08:00
Ilia Cherniavskii	1a92802bde	Set USE_KINETO=1 (#49201 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49201 This unblocks kineto profiler for 1.8 release. This PR supersedes https://github.com/pytorch/pytorch/pull/48391 Note: this will somewhat increase the size of linux server binaries, because we add libkineto.a and libcupti_static.a: -rw-r--r-- 1 jenkins jenkins 1107502 Dec 10 21:16 build/lib/libkineto.a -rw-r--r-- 1 root root 13699658 Nov 13 2019 /usr/local/cuda/lib64/libcupti_static.a Test Plan: CI https://github.com/pytorch/pytorch/pull/48391 Imported from OSS Reviewed By: ngimel Differential Revision: D25480770 fbshipit-source-id: 037cd774f5547d9918d6055ef5cc952a54e48e4c	2020-12-18 01:48:10 -08:00
Taylor Robie	0225d3dc9d	Add support for timing C++ snippets. (#47864 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47864 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D25199262 Pulled By: robieta fbshipit-source-id: 1c2114628ed543fba4f403bf49c065f4d71388e2	2020-12-01 20:03:14 -08:00
Taylor Robie	17ea11259a	Rework compat bindings. (#47863 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47863 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D25199261 Pulled By: robieta fbshipit-source-id: 0a4a0409ddb75c1bf66cd31d67b55080227b1679	2020-12-01 20:03:11 -08:00
Nikita Shulga	2dff0b3e91	Fix typos in comments (#48316 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48316 Reviewed By: walterddr, mrshenli Differential Revision: D25125123 Pulled By: malfet fbshipit-source-id: 6f31e5456cc078cc61b288191f1933711acebba0	2020-11-24 10:56:40 -08:00
Ilia Cherniavskii	f2da18af14	Add USE_KINETO build option (#45888 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45888 Adding USE_LIBKINETO build option Test Plan: USE_KINETO=1 USE_CUDA=1 USE_MKLDNN=1 BLAS=MKL BUILD_BINARY=1 python setup.py develop install --cmake Reviewed By: Chillee Differential Revision: D25142221 Pulled By: ilia-cher fbshipit-source-id: d1634a8f9599604ff511fac59b9072854289510c	2020-11-21 20:20:32 -08:00
Nikita Shulga	d7c8d3cccb	Remove references to `typing` module from setup.py (#47677 ) Summary: It is part of core Python-3.6.2+ Fixes https://github.com/pytorch/pytorch/issues/47596 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47677 Reviewed By: walterddr Differential Revision: D24860188 Pulled By: malfet fbshipit-source-id: ad72b433a4493ebe5caca97c2e8a9d4b3c8172d4	2020-11-12 10:04:38 -08:00
peter	a08e8dd70c	Fix python 3.9 builds on Windows (#47602 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/47460. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47602 Reviewed By: heitorschueroff Differential Revision: D24832487 Pulled By: malfet fbshipit-source-id: 8846caeac5e767e8066470d5c981218f147c88dc	2020-11-09 12:39:28 -08:00
Nikita Shulga	6f6025183f	Skip iomp5 embedding if torch_cpu could not be found (#47390 ) Summary: This would be the case when the package is built for local development rather than for installation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47390 Reviewed By: janeyx99 Differential Revision: D24738416 Pulled By: malfet fbshipit-source-id: 22bd676bc46e5d50a09539c969ce56d37cfe5952	2020-11-04 14:22:53 -08:00
Nikita Shulga	3a0024574d	Do not delete rpath from torch.dylib on Darwin (#47337 ) Summary: Fixes CI regressions introduced by https://github.com/pytorch/pytorch/issues/47262 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47337 Reviewed By: ngimel Differential Revision: D24721954 Pulled By: malfet fbshipit-source-id: 395b037b29c0fc3b62ca50bba9be940ad72e0c5b	2020-11-03 22:36:35 -08:00
Nikita Shulga	ca61b061f3	Update minimum supported Python version to 3.6.2 (#47314 ) Summary: As typing.NoReturn is used in the codebase Pull Request resolved: https://github.com/pytorch/pytorch/pull/47314 Reviewed By: seemethere Differential Revision: D24712847 Pulled By: malfet fbshipit-source-id: f0692d408316d630bc11f1ee881b695437fb47d4	2020-11-03 13:32:07 -08:00
Nikita Shulga	14194e4f23	Embed `libiomp5.dylib` into wheel package (#47262 ) Summary: The libiomp runtime is the only external dependency the OS X package has if compiled with MKL. Copy it to the stage directory from one of the available rpaths, and remove all absolute rpaths, since the project should have none. Fixes https://github.com/pytorch/pytorch/issues/38607 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47262 Reviewed By: walterddr Differential Revision: D24705094 Pulled By: malfet fbshipit-source-id: 9f588a3ec3c6c836c8986d858fb53df815a506c8	2020-11-03 13:00:30 -08:00
Nikita Shulga	8c39f198b4	Fix typo in setup.py (#46921 ) Summary: Also, be a bit future-proof in support version list Pull Request resolved: https://github.com/pytorch/pytorch/pull/46921 Reviewed By: seemethere Differential Revision: D24568733 Pulled By: malfet fbshipit-source-id: ae34f8da1ed39b80dc34db0b06e4ef142104a3ff	2020-10-27 13:14:41 -07:00
Nikita Shulga	a38eeeff5c	Make setup.py python 2 friendly (#46317 ) Summary: import print_function to make setup.py invoked by Python2 print human readable error: ``` % python2 setup.py Python 2 has reached end-of-life and is no longer supported by PyTorch. ``` Also, remove `future` from the list of the PyTorch package install dependencies Pull Request resolved: https://github.com/pytorch/pytorch/pull/46317 Reviewed By: walterddr, bugra Differential Revision: D24305004 Pulled By: malfet fbshipit-source-id: 9181186170562384dd2c0e6a8ff0b1e93508f221	2020-10-14 16:37:06 -07:00
Nikita Shulga	45de2ee3ac	Remove Python version upper boundary check (#46315 ) Summary: This prevents setup.py from erroring out when Python-3.9 is used Fixes https://github.com/pytorch/pytorch/issues/46314 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46315 Reviewed By: heitorschueroff Differential Revision: D24304846 Pulled By: malfet fbshipit-source-id: 573a88ea8c1572d7d8a9991539effb3c228bffc9	2020-10-14 07:36:55 -07:00
Eli Uriegas	615013edcb	setup: Dataclasses only when < 3.7 (#45844 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45844 Someone pointed out that dataclasses were actually added to the python stdlib in 3.7 and not 3.8, so bumping down the dependency on dataclasses from 3.8 -> 3.7 makes sense here Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: walterddr, malfet Differential Revision: D24113367 Pulled By: seemethere fbshipit-source-id: 03d2d93f7d966d48a30a8e2545fd07dfe63b4fb3	2020-10-05 13:29:21 -07:00
Michael Suo	18253f4a48	Fix BUILD_CAFFE2 if FBGEMM and NNPACK are not built (#45610 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45610 Also add to the usual documentation places that this option exists. Test Plan: Imported from OSS Reviewed By: gmagogsfm Differential Revision: D24058199 Pulled By: suo fbshipit-source-id: 81574fbd042f47587e2c7820c726fac0f68af2a7	2020-10-01 14:58:55 -07:00
Eli Uriegas	5959de3aeb	setup: Only include dataclasses for py < 3.8 (#45611 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45611 dataclasses was made a standard library item in 3.8 Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: walterddr Differential Revision: D24031740 Pulled By: seemethere fbshipit-source-id: 15bdf1fe0d8de9b8ba7912e4a651f06b18d516ee	2020-10-01 14:52:28 -07:00
Bugra Akyildiz	27c7158166	Remove __future__ imports for legacy Python2 supports (#45033 ) Summary: There is a module called `2to3` which you can target for future specifically to remove these, the directory of `caffe2` has the most redundant imports: ```2to3 -f future -w caffe2``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033 Reviewed By: seemethere Differential Revision: D23808648 Pulled By: bugra fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38	2020-09-23 17:57:02 -07:00
Daily, Jeff	b98ac20849	install ATen/native/cuda and hip headers (#45097 ) Summary: The ATen/native/cuda headers were copied to torch/include, but then not included in the final package. Further, add ATen/native/hip headers to the installation, as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45097 Reviewed By: mruberry Differential Revision: D23831006 Pulled By: malfet fbshipit-source-id: ab527928185faaa912fd8cab208733a9b11a097b	2020-09-22 17:43:47 -07:00
Michael Suo	161490d441	Move `torch/version.py` generation to cmake (#44577 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44577 I would like to move this to cmake so that I can depend on it happening from other parts of the build. This PR pulls out the logic for determining the version string and writing the version file into its own module. `setup.py` still receives the version string and uses it as before, but now the code for writing out `torch/version.py` lives in a custom command in torch/CMakeLists.txt. I noticed a small inconsistency in how version info is populated. `TORCH_BUILD_VERSION` is populated from `setup.py` at configuration time, while `torch/version.py` is written at build time. So if, e.g., you configured cmake on a certain git rev, then built it on another, the two versions would be inconsistent. This does not appear to matter, so I opted to preserve the existing behavior. Test Plan: Imported from OSS Reviewed By: bertmaher Differential Revision: D23734781 Pulled By: suo fbshipit-source-id: 4002c9ec8058503dc0550f8eece2256bc98c03a4	2020-09-16 15:49:22 -07:00
Alexander Grund	d23f3170ef	Remove pybind11 from required submodules (#44278 ) Summary: This can be taken from the system in which case it is not used from the submodule. Hence the check here limits the usage unnecessarily ccing malfet Pull Request resolved: https://github.com/pytorch/pytorch/pull/44278 Reviewed By: malfet Differential Revision: D23568552 Pulled By: ezyang fbshipit-source-id: 7fd2613251567f649b12eca0b1fe7663db9cb58d	2020-09-09 08:07:13 -07:00
Edward Yang	6ea89166bd	Rewrite of ATen code generator (#42629 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42629 How to approach reviewing this diff: - The new codegen itself lives in `tools/codegen`. Start with `gen.py`, then read `model.py` and then the `api/` folder. The comments at the top of the files describe what is going on. The CLI interface of the new codegen is similar to the old one, but (1) it is no longer necessary to explicitly specify cwrap inputs (and now we will error if you do so) and (2) the default settings for source and install dir are much better; to the extent that if you run the codegen from the root source directory as just `python -m tools.codegen.gen`, something reasonable will happen. - The old codegen is (nearly) entirely deleted; every Python file in `aten/src/ATen` was deleted except for `common_with_cwrap.py`, which now permanently finds its home in `tools/shared/cwrap_common.py` (previously cmake copied the file there), and `code_template.py`, which now lives in `tools/codegen/code_template.py`. We remove the copying logic for `common_with_cwrap.py`. - All of the inputs to the old codegen are deleted. - Build rules now have to be adjusted to not refer to files that no longer exist, and to abide by the (slightly modified) CLI. - LegacyTHFunctions files have been generated and checked in. We expect these to be deleted as these final functions get ported to ATen. The deletion process is straightforward; just delete the functions of the ones you are porting. There are 39 more functions left to port. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D23183978 Pulled By: ezyang fbshipit-source-id: 6073ba432ad182c7284a97147b05f0574a02f763	2020-08-31 09:00:22 -07:00

602 Commits