pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
peterjc123	44ff79d849	Automatically set BUILD_SPLIT_CUDA for cpp exts (#52503 ) Summary: Fixes https://github.com/pytorch/vision/pull/3418#issuecomment-781673110 Pull Request resolved: https://github.com/pytorch/pytorch/pull/52503 Reviewed By: malfet Differential Revision: D26546857 Pulled By: janeyx99 fbshipit-source-id: a100b408e7cd28695145a1dda7f2fa081bb7f21f	2021-02-19 12:22:55 -08:00
Jane Xu	550c965b2e	Re-enable test_standalone_load for Windows 11.1 (#51596 ) Summary: This fixes the earlier failure by adding stricter conditions in cpp_extension.py. To test, run a split torch_cuda build on Windows with `export BUILD_SPLIT_CUDA=ON && python setup.py develop`, then run the following test: `python test/test_utils.py TestStandaloneCPPJIT.test_load_standalone`. It should pass. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51596 Reviewed By: malfet Differential Revision: D26213816 Pulled By: janeyx99 fbshipit-source-id: a752ce7f9ab9d73dcf56f952bed2f2e040614443	2021-02-03 08:58:34 -08:00
Nikita Shulga	f7313b3105	Fix Python.h discovery logic on some MacOS platforms (#51586 ) Summary: On all non-Windows platforms we should use the 'posix_prefix' scheme to discover the location of the Python.h header. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51586 Reviewed By: ezyang Differential Revision: D26208684 Pulled By: malfet fbshipit-source-id: bafa6d79de42231629960c642d535f1fcf7a427f	2021-02-02 21:38:37 -08:00
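As a rough illustration of the header discovery described in f7313b3105, the sketch below picks an install scheme by platform and asks `sysconfig` for the include directory. The helper name `python_include_dir` and the choice of the `nt` scheme on Windows are assumptions made for illustration; this is not the code from the commit.

```python
import os
import sysconfig


def python_include_dir() -> str:
    """Return the directory expected to contain Python.h.

    Assumption: 'nt' on Windows, 'posix_prefix' everywhere else.
    """
    scheme = "nt" if os.name == "nt" else "posix_prefix"
    # sysconfig.get_path(name, scheme) expands the scheme's path template
    # for the running interpreter.
    return sysconfig.get_path("include", scheme)


if __name__ == "__main__":
    print(python_include_dir())
```

Naming the scheme explicitly avoids depending on whatever default scheme the interpreter was configured with, which appears to be the situation the commit addresses on some macOS installations.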
Jane Xu	88af2149e1	Add build option to split torch_cuda library into torch_cuda_cu and torch_cuda_cpp (#49050 ) Summary: Because of the size of our `libtorch_cuda.so`, linking with other hefty binaries presents a problem where 32bit relocation markers are too small and end up overflowing. This PR attempts to break up `torch_cuda` into `torch_cuda_cu` and `torch_cuda_cpp`. `torch_cuda_cu`: all the files previously in `Caffe2_GPU_SRCS` that are * pure `.cu` files in `aten`match * all the BLAS files * all the THC files, except for THCAllocator.cpp, THCCachingHostAllocator.cpp and THCGeneral.cpp * all files in`detail` * LegacyDefinitions.cpp and LegacyTHFunctionsCUDA.cpp * RegisterCUDA.cpp CUDAHooks.cpp * CUDASolver.cpp * TensorShapeCUDA.cpp `torch_cuda_cpp`: all other files in `Caffe2_GPU_SRCS` Accordingly, TORCH_CUDA_API and TORCH_CUDA_BUILD_MAIN_LIB usages are getting split as well to TORCH_CUDA_CU_API and TORCH_CUDA_CPP_API. To test this locally, you can run `export BUILD_SPLIT_CUDA=ON && python setup.py develop`. In your `build/lib` folder, you should find binaries for both `torch_cuda_cpp` and `torch_cuda_cu`. To see that the SPLIT_CUDA option was toggled, you can grep the Summary of running cmake and make sure `Split CUDA` is ON. This build option is tested on CI for CUDA 11.1 builds (linux for now, but windows soon). Pull Request resolved: https://github.com/pytorch/pytorch/pull/49050 Reviewed By: walterddr Differential Revision: D26114310 Pulled By: janeyx99 fbshipit-source-id: 0180f2519abb5a9cdde16a6fb7dd3171cff687a6	2021-02-01 18:42:35 -08:00
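To make the effect of the split on downstream builds concrete, here is a minimal sketch of how a build script might choose which CUDA libraries to link. The function names and the set of values treated as "enabled" are hypothetical; only the library names `torch_cuda`, `torch_cuda_cu`, `torch_cuda_cpp` and the `BUILD_SPLIT_CUDA` variable come from the commit message.

```python
import os


def split_cuda_enabled(env=None) -> bool:
    """Hypothetical helper: read BUILD_SPLIT_CUDA from an environment mapping.

    Assumption: 'ON' and '1' (case-insensitive) count as enabled.
    """
    env = os.environ if env is None else env
    return env.get("BUILD_SPLIT_CUDA", "").upper() in ("ON", "1")


def cuda_link_libraries(build_split_cuda: bool) -> list:
    """Return the CUDA library names an extension would link against."""
    if build_split_cuda:
        # Split build: two smaller libraries replace the monolithic one.
        return ["torch_cuda_cu", "torch_cuda_cpp"]
    return ["torch_cuda"]


if __name__ == "__main__":
    print(cuda_link_libraries(split_cuda_enabled()))
```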
Jithun Nair	327539ca79	Fix bug in hipify if include_dirs is not specified in setup.py (#50703 ) Summary: Bugs: 1) would introduce -I* in compile commands 2) wouldn't hipify source code directly in build_dir, only one level down or more Pull Request resolved: https://github.com/pytorch/pytorch/pull/50703 Reviewed By: mrshenli Differential Revision: D25949070 Pulled By: ngimel fbshipit-source-id: 018c2a056b68019a922e20e5db2eb8435ad147fe	2021-01-19 16:30:17 -08:00
Ralf Gommers	e29082b2a6	Run mypy over test/test_utils.py (#50278 ) Summary: _resubmission of gh-49654, which was reverted due to a cross-merge conflict_ This caught one incorrect annotation in `cpp_extension.load`. xref gh-16574. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50278 Reviewed By: walterddr Differential Revision: D25865278 Pulled By: ezyang fbshipit-source-id: 25489191628af5cf9468136db36f5a0f72d9d54d	2021-01-11 08:16:23 -08:00
Rong Rong (AI Infra)	e3c56ddde6	Revert D25757691: [pytorch][PR] Run mypy over test/test_utils.py Test Plan: revert-hammer Differential Revision: D25757691 (`c86cfcd81d`) Original commit changeset: 145ce3ae532c fbshipit-source-id: 3dfd68f0c42fc074cde15c6213a630b16e9d8879	2021-01-05 13:40:13 -08:00
Ralf Gommers	c86cfcd81d	Run mypy over test/test_utils.py (#49654 ) Summary: This caught one incorrect annotation in `cpp_extension.load`. xref gh-16574. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49654 Reviewed By: heitorschueroff Differential Revision: D25757691 Pulled By: ezyang fbshipit-source-id: 145ce3ae532cc585d9ca3bbd5381401bad0072e2	2021-01-05 09:32:06 -08:00
Samuel Marks	e6779d4357	[*.py] Rename "Arguments:" to "Args:" (#49736 ) Summary: I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings. ```sh (pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" \| paste -s -d+ -- \| bc)"; done Args: 1095 Arguments: 0336 ``` It is easy enough to extend my parsers to support both variants, however it looks like `Arguments:` is wrong anyway, as per: - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md) - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md) - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst) Therefore, only `Args:` is valid. This PR replaces them throughout the codebase. PS: For related PRs, see tensorflow/tensorflow/pull/45420 PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736 Reviewed By: albanD Differential Revision: D25710534 Pulled By: soumith fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619	2020-12-28 09:34:47 -08:00
Stas Bekman	60b4c40101	[extensions] fix `is_ninja_available` during cuda extension building (#49443 ) Summary: tldr: current version of `is_ninja_available` of `torch/utils/cpp_extension.py` fails to run in the recent incarnations of pip w/ new build isolation feature which is now a default. This PR fixes this problem. The full story follows: -------------------------- Currently trying to build https://github.com/facebookresearch/fairscale/ which builds cuda extensions fails with the recent pip versions. The build is failing to perform `is_ninja_available`, which runs a simple subprocess to run `ninja --version` but does it with some /dev/null stream override which seems to break with the new pip versions. Currently I have `pip==20.3.3`. The recent pip performs build isolation which first fetches all dependencies to somewhere under /tmp/pip-install-xyz and then builds the package. If I build: ``` pip install fairscale --no-build-isolation ``` everything works. When building normally (i.e. without `--no-build-isolation`), the failure is a long long trace, <details> <summary>Full log</summary> <pre> pip install fairscale Collecting fairscale Downloading fairscale-0.1.1.tar.gz (83 kB) \|████████████████████████████████\| 83 kB 562 kB/s Installing build dependencies ... done Getting requirements to build wheel ... 
error ERROR: Command errored out with exit status 1: command: /home/stas/anaconda3/envs/main-38/bin/python /home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmpjvw00c7v cwd: /tmp/pip-install-1wq9f8fp/fairscale_347f218384a64f24b8d5ce846641213e Complete output (55 lines): running egg_info writing fairscale.egg-info/PKG-INFO writing dependency_links to fairscale.egg-info/dependency_links.txt writing requirements to fairscale.egg-info/requires.txt writing top-level names to fairscale.egg-info/top_level.txt Traceback (most recent call last): File "/home/stas/anaconda3/envs/main-38/bin/ninja", line 5, in <module> from ninja import ninja ModuleNotFoundError: No module named 'ninja' Traceback (most recent call last): File "/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 280, in <module> main() File "/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 263, in main json_out['return_val'] = hook(hook_input['kwargs']) File "/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 114, in get_requires_for_build_wheel return hook(config_settings) File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 149, in get_requires_for_build_wheel return self._get_build_requires( File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 130, in _get_build_requires self.run_setup() File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 145, in run_setup exec(compile(code, __file__, 'exec'), locals()) File "setup.py", line 56, in <module> setuptools.setup( File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup return distutils.core.setup(attrs) File 
"/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/core.py", line 148, in setup dist.run_commands() File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/dist.py", line 966, in run_commands self.run_command(cmd) File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/dist.py", line 985, in run_command cmd_obj.run() File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 298, in run self.find_sources() File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 305, in find_sources mm.run() File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 536, in run self.add_defaults() File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 572, in add_defaults sdist.add_defaults(self) File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/command/sdist.py", line 228, in add_defaults self._add_defaults_ext() File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/command/sdist.py", line 311, in _add_defaults_ext build_ext = self.get_finalized_command('build_ext') File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/cmd.py", line 298, in get_finalized_command cmd_obj = self.distribution.get_command_obj(command, create) File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/dist.py", line 858, in get_command_obj cmd_obj = self.command_obj[command] = klass(self) File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 351, in __init__ if not is_ninja_available(): File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1310, in is_ninja_available subprocess.check_call('ninja --version'.split(), stdout=devnull) File "/home/stas/anaconda3/envs/main-38/lib/python3.8/subprocess.py", line 364, in check_call raise 
CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['ninja', '--version']' returned non-zero exit status 1. ---------------------------------------- ERROR: Command errored out with exit status 1: /home/stas/anaconda3/envs/main-38/bin/python /home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmpjvw00c7v Check the logs for full command output. </pre> </details> and the middle of it is what we want: ``` File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 351, in __init__ if not is_ninja_available(): File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1310, in is_ninja_available subprocess.check_call('ninja --version'.split(), stdout=devnull) File "/home/stas/anaconda3/envs/main-38/lib/python3.8/subprocess.py", line 364, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['ninja', '--version']' returned non-zero exit status 1. ``` For some reason pytorch fails to run this simple code: ``` # torch/utils/cpp_extension.py def is_ninja_available(): r''' Returns ``True`` if the `ninja <https://ninja-build.org/>`_ build system is available on the system, ``False`` otherwise. ''' with open(os.devnull, 'wb') as devnull: try: subprocess.check_call('ninja --version'.split(), stdout=devnull) except OSError: return False else: return True ``` I suspect that pip does something to `os.devnull` and that's why it fails. This PR proposes a simpler code which doesn't rely on anything but `subprocess.check_output`: ``` def is_ninja_available(): r''' Returns ``True`` if the `ninja <https://ninja-build.org/>`_ build system is available on the system, ``False`` otherwise. ''' try: subprocess.check_output('ninja --version'.split()) except Exception: return False else: return True ``` which doesn't use `os.devnull` and performs the same function. 
There could be a whole bunch of different exceptions there I think, so I went for the generic one - we don't care why it failed, since this function's only purpose is to suggest whether ninja can be used or not. Let's check ``` python -c "import torch.utils.cpp_extension; print(torch.utils.cpp_extension.is_ninja_available())" True ``` Look ma - no std noise to take care of. (i.e. no need for /dev/null). I was editing the installed environment-wide `cpp_extension.py` file directly, so didn't need to tweak `PYTHONPATH` - I made sure to replace `'ninja --version'.` with something that should fail and I did get `False` for the above command line. I next did a somewhat elaborate cheat to re-package an already existing binary wheel with this corrected version of `cpp_extension.py`, rather than building from source: ``` mkdir /tmp/pytorch-local-channel cd /tmp/pytorch-local-channel # get the latest nightly wheel wget https://download.pytorch.org/whl/nightly/cu110/torch-1.8.0.dev20201215%2Bcu110-cp38-cp38-linux_x86_64.whl # unpack it unzip torch-1.8.0.dev20201215+cu110-cp38-cp38-linux_x86_64.whl # edit torch/utils/cpp_extension.py to fix the python code with the new version as in this PR emacs torch/utils/cpp_extension.py & # pack the files back zip -r torch-1.8.0.dev20201215+cu110-cp38-cp38-linux_x86_64.whl caffe2 torch torch-1.8.0.dev20201215+cu110.dist-info ``` Now I tell pip to use my local channel, plus `--pre` for it to pick up the pre-release as an acceptable wheel ``` # install using this local channel git clone https://github.com/facebookresearch/fairscale/ cd fairscale pip install -v --disable-pip-version-check -e . -f file:///tmp/pytorch-local-channel --pre ``` and voila all works. ``` [...] 
Successfully installed fairscale ``` I noticed a whole bunch of ninja not found errors in the log, which I think is the same problem with other parts of the build system packages which also use this old check copied all over various projects and build tools, and which the recent pip breaks. ``` writing manifest file '/tmp/pip-modern-metadata-_nsdesbq/fairscale.egg-info/SOURCES.txt' Traceback (most recent call last): File "/home/stas/anaconda3/envs/main-38/bin/ninja", line 5, in <module> from ninja import ninja ModuleNotFoundError: No module named 'ninja' [...] /tmp/pip-build-env-fqflyevr/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py:364: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend. warnings.warn(msg.format('we could not find ninja.')) ``` but these don't prevent from the build completing and installing. I suppose these need to be identified and reported to various other projects, but that's another story. The new pip does something to `os.devnull` I think which breaks any code relying on it - I haven't tried to figure out what happens to that stream object, but this PR which removes its usage solves the problem. Also do notice that: ``` git clone https://github.com/facebookresearch/fairscale/ cd fairscale python setup.py bdist_wheel pip install dist/fairscale-0.1.1-cp38-cp38-linux_x86_64.whl ``` works too. So it is really a pip issue. Apologies if the notes are too many, I tried to give the complete picture and probably other projects will need those details as well. Thank you for reading. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49443 Reviewed By: mruberry Differential Revision: D25592109 Pulled By: ezyang fbshipit-source-id: bfce4420c28b614ead48e9686f4153c6e0fbe8b7	2020-12-16 18:02:11 -08:00
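The availability check proposed in 60b4c40101 can be generalized to any command-line tool. The sketch below follows the same idea (run the command, treat any exception as "not available"); the name `is_tool_available` is hypothetical and is not part of `torch.utils.cpp_extension`.

```python
import subprocess
import sys


def is_tool_available(command) -> bool:
    """Return True if `command` (a list of arguments) runs and exits with 0."""
    try:
        # check_output captures stdout itself, so no os.devnull handle is needed.
        subprocess.check_output(command)
    except Exception:
        # A missing executable, a non-zero exit status, or a permission error
        # all mean the tool cannot be used.
        return False
    return True


if __name__ == "__main__":
    print(is_tool_available(["ninja", "--version"]))
    print(is_tool_available([sys.executable, "--version"]))
```

Catching `Exception` is deliberately broad, in line with the commit message's reasoning that the caller only needs to know whether the tool is usable, not why it failed.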
Gao, Xiang	d409da0677	Fix CUDA extension ninja build (#49344 ) Summary: I am submitting this PR on behalf of Janne Hellsten (nurpax) from NVIDIA, for the convenience of CLA. Many thanks to Janne for the contribution! Currently, the ninja build decides whether to rebuild a .cu file more or less at random, for two reasons. First, the arch list in the build command is ordered randomly; whenever the order changes, ninja rebuilds unconditionally, regardless of timestamps. Second, header files are not included in the dependency list, so ninja may not rebuild when a header file changes. This PR fixes both issues. The fix for the second issue requires nvcc >= 10.2. With nvcc < 10.2, CUDA extensions still build as before, but changes in header files go undetected. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49344 Reviewed By: glaringlee Differential Revision: D25540157 Pulled By: ezyang fbshipit-source-id: 197541690d7f25e3ac5ebe3188beb1f131a4c51f	2020-12-16 17:45:12 -08:00
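The first issue in d409da0677 (a randomly ordered arch list changing the build command) can be illustrated with a small sketch that always emits `-gencode` flags in a fixed order. The function `gencode_flags` and its input format (strings such as `"7.5"` or `"5.0+PTX"`) are assumptions for illustration, not the code in `cpp_extension.py`.

```python
def gencode_flags(arch_list):
    """Build nvcc -gencode flags in a deterministic order.

    arch_list: iterable of strings like "7.5" or "5.0+PTX" (assumed format).
    """
    def sort_key(arch):
        major, minor = arch.replace("+PTX", "").split(".")
        return (int(major), int(minor), arch.endswith("+PTX"))

    flags = []
    # set() removes duplicates; sorting numerically makes the command line
    # identical from one run to the next, so ninja sees no spurious change.
    for arch in sorted(set(arch_list), key=sort_key):
        num = arch.replace("+PTX", "").replace(".", "")
        candidates = [f"-gencode=arch=compute_{num},code=sm_{num}"]
        if arch.endswith("+PTX"):
            candidates.append(f"-gencode=arch=compute_{num},code=compute_{num}")
        for flag in candidates:
            if flag not in flags:
                flags.append(flag)
    return flags


if __name__ == "__main__":
    print(gencode_flags(["7.5", "5.0+PTX", "7.5"]))
```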
Stas Bekman	02b63858f2	[CUDAExtension] support all visible cards when building a cudaextension (#48891 ) Summary: Currently CUDAExtension assumes that all cards on the same machine are of the same type and builds the extension with the compute capability of the 0th card. This breaks later at runtime if the machine has cards of different types, specifically resulting in: ``` RuntimeError: CUDA error: no kernel image is available for execution on the device ``` when cards of a type that wasn't compiled for are used (and the error message does little to tell the uninitiated what the problem is). My current setup is: ``` $ CUDA_VISIBLE_DEVICES=0 python -c "import torch; print(torch.cuda.get_device_capability())" (8, 6) $ CUDA_VISIBLE_DEVICES=1 python -c "import torch; print(torch.cuda.get_device_capability())" (6, 1) ``` but the extension was getting built with `-gencode=arch=compute_80,code=sm_80`. This PR: * [x] introduces a loop over all devices visible at build time to ensure the extension will run on all of them (it sorts the new list generated by the loop, so that the output is easier to debug should a card with lower compute capability come last) * [x] adds `+PTX` to the last entry of ccs derived from local cards (`if not _arch_list:`) to support other archs * [x] adds a digest of my conversation with ptrblck on slack in the form of docs which hopefully can help others know which archs to support, how to override defaults, when and how to add PTX, etc. Please kindly review that my prose is clear and easy to understand. ptrblck Pull Request resolved: https://github.com/pytorch/pytorch/pull/48891 Reviewed By: ngimel Differential Revision: D25358285 Pulled By: ezyang fbshipit-source-id: 8160f3adebffbc8e592ddfcc3adf153a9dc91557	2020-12-08 14:57:10 -08:00
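The device loop described in 02b63858f2 might look roughly like the sketch below. To keep it runnable without a GPU, it takes the capabilities as a plain list of `(major, minor)` tuples; the function name `arch_list_from_capabilities` is hypothetical and the code is an approximation of the described behaviour, not the commit's implementation.

```python
def arch_list_from_capabilities(capabilities):
    """Turn per-device compute capabilities into a sorted arch list.

    capabilities: list of (major, minor) tuples, one per visible device.
    With PyTorch installed, such a list could be gathered with
    [torch.cuda.get_device_capability(i) for i in range(torch.cuda.device_count())].
    """
    # Deduplicate and sort so the result does not depend on device order.
    unique = sorted(set(capabilities))
    archs = [f"{major}.{minor}" for major, minor in unique]
    if archs:
        # Per the commit message, +PTX goes on the last (highest) entry.
        archs[-1] += "+PTX"
    return archs


if __name__ == "__main__":
    # The two cards from the commit message: (8, 6) and (6, 1).
    print(arch_list_from_capabilities([(8, 6), (6, 1)]))
```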
Jithun Nair	5f62308739	Hipify revamp [REDUX] (#48715 ) Summary: [Refiled version of earlier PR https://github.com/pytorch/pytorch/issues/45451] This PR revamps the hipify module in PyTorch to overcome a long list of shortcomings in the original implementation. However, these improvements are applied only when using hipify to build PyTorch extensions, not for PyTorch or Caffe2 itself. Correspondingly, changes are made to cpp_extension.py to match these improvements. The list of improvements to hipify is as follows: 1. Hipify files in the same directory as the original file, unless there's a "cuda" subdirectory in the original file path, in which case the hipified file will be in the corresponding file path with "hip" subdirectory instead of "cuda". 2. Never hipify the file in-place if changes are introduced due to hipification i.e. always ensure the hipified file either resides in a different folder or has a different filename compared to the original file. 3. Prevent re-hipification of already hipified files. This avoids creation of unnecessary "hip/hip" etc. subdirectories and additional files which have no actual use. 4. Do not write out hipified versions of files if they are identical to the original file. This results in a cleaner output directory, with minimal number of hipified files created. 5. Update header rewrite logic so that it accounts for the previous improvement. 6. Update header rewrite logic so it respects the rules for finding header files depending on whether "" or <> is used. 7. Return a dictionary of mappings of original file paths to hipified file paths from hipify function. 8. Introduce a version for hipify module to allow extensions to contain back-compatible code that targets a specific point in PyTorch where the hipify functionality changed. 9. Update cuda_to_hip_mappings.py to account for the ROCm component subdirectories inside /opt/rocm/include. 
This also results in cleanup of the Caffe2_HIP_INCLUDE path to remove unnecessary additions to the include path. The list of changes to cpp_extension.py is as follows: 1. Call hipify when building a CUDAExtension for ROCm. 2. Prune the list of source files to CUDAExtension to include only the hipified versions of any source files in the list (if both original and hipified versions of the source file are in the list) 3. Add subdirectories of /opt/rocm/include to the include path for extensions, so that ROCm headers for subcomponent libraries are found automatically cc jeffdaily sunway513 ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/48715 Reviewed By: bdhirsh Differential Revision: D25272824 Pulled By: ezyang fbshipit-source-id: 8bba68b27e41ca742781e1c4d7b07c6f985f040e	2020-12-02 18:03:23 -08:00
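Rules 1 to 3 of the hipify list above (output location, no in-place rewrite, no re-hipification) can be sketched as a path-mapping function. This is only an illustration under stated assumptions: the name `hipified_path` is hypothetical, and the `_hip` filename suffix used for files outside a "cuda" directory is an assumption, since the commit message only says the filename must differ.

```python
from pathlib import PurePosixPath


def hipified_path(path: str) -> str:
    """Map a source path to the path of its hipified counterpart."""
    p = PurePosixPath(path)
    dirs = list(p.parts[:-1])
    if "hip" in dirs:
        # Rule 3: already hipified, so do not create "hip/hip" and the like.
        return str(p)
    if "cuda" in dirs:
        # Rule 1: mirror the file under "hip" instead of "cuda".
        dirs = ["hip" if part == "cuda" else part for part in dirs]
        return str(PurePosixPath(*dirs, p.name))
    # Rule 2: same directory, but never the same filename
    # (the "_hip" suffix is an assumption).
    return str(p.with_name(p.stem + "_hip" + p.suffix))


if __name__ == "__main__":
    for src in ("src/cuda/kernel.cu", "src/kernel.cu", "src/hip/kernel.cu"):
        print(src, "->", hipified_path(src))
```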
Eli Uriegas	780f2b9a9b	torch: Stop using _nt_quote_args from distutils (#48618 ) Summary: They removed the specific function in Python 3.9 so we should just remake the function here and use our own instead of relying on hidden functions from the stdlib Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Fixes https://github.com/pytorch/pytorch/issues/48617 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48618 Reviewed By: samestep Differential Revision: D25230281 Pulled By: seemethere fbshipit-source-id: 57216af40a4ae4dc8bafcf40d2eb3ba793b9b6e2	2020-12-02 16:53:25 -08:00
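A stand-in for the removed distutils helper can be very small. The sketch below quotes arguments that contain spaces, which is what the old private `_nt_quote_args` did; the name `nt_quote_args` and the handling of `None` are assumptions, and the replacement in PyTorch may differ in detail.

```python
from typing import List, Optional


def nt_quote_args(args: Optional[List[str]]) -> List[str]:
    """Wrap arguments containing spaces in double quotes (Windows command lines)."""
    if not args:
        return []
    # Return a new list instead of mutating the caller's list.
    return [f'"{arg}"' if " " in arg else arg for arg in args]


if __name__ == "__main__":
    print(nt_quote_args(["-IC:\\Program Files\\CUDA\\include", "/MD"]))
```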
Taylor Robie	022c929145	Revert "Revert D25199264: Enable callgrind collection for C++ snippets" (#48720 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48720 This reverts commit `6646ff122d`. Test Plan: Imported from OSS Reviewed By: malfet Differential Revision: D25273994 Pulled By: malfet fbshipit-source-id: 61743176dc650136622e1b8f2384bbfbd7a46294	2020-12-02 11:10:11 -08:00
Taylor Robie	07f038aa9d	Add option for cpp_extensions to compile standalone executable (#47862 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47862 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D25199265 Pulled By: robieta fbshipit-source-id: eceb04dea60b82eb10434099639fa3afa61000ca	2020-12-01 20:03:08 -08:00
Nikita Shulga	8af9f2cc23	Revert D24924736: [pytorch][PR] Hipify revamp Test Plan: revert-hammer Differential Revision: D24924736 (`10b490a3e0`) Original commit changeset: 4af42b8ff4f2 fbshipit-source-id: 7f8f90d55d8a69a2890ec73622fcea559189e381	2020-11-18 11:48:30 -08:00
Jithun Nair	10b490a3e0	Hipify revamp (#45451 ) Summary: This PR revamps the hipify module in PyTorch to overcome a long list of shortcomings in the original implementation. However, these improvements are applied only when using hipify to build PyTorch extensions, not for PyTorch or Caffe2 itself. Correspondingly, changes are made to `cpp_extension.py` to match these improvements. The list of improvements to hipify is as follows: 1. Hipify files in the same directory as the original file, unless there's a "cuda" subdirectory in the original file path, in which case the hipified file will be in the corresponding file path with "hip" subdirectory instead of "cuda". 2. Never hipify the file in-place if changes are introduced due to hipification i.e. always ensure the hipified file either resides in a different folder or has a different filename compared to the original file. 3. Prevent re-hipification of already hipified files. This avoids creation of unnecessary "hip/hip" etc. subdirectories and additional files which have no actual use. 4. Do not write out hipified versions of files if they are identical to the original file. This results in a cleaner output directory, with minimal number of hipified files created. 5. Update header rewrite logic so that it accounts for the previous improvement. 6. Update header rewrite logic so it respects the rules for finding header files depending on whether `""` or `<>` is used. 7. Return a dictionary of mappings of original file paths to hipified file paths from `hipify` function. 8. Introduce a version for hipify module to allow extensions to contain back-compatible code that targets a specific point in PyTorch where the hipify functionality changed. 9. Update `cuda_to_hip_mappings.py` to account for the ROCm component subdirectories inside `/opt/rocm/include`. This also results in cleanup of the `Caffe2_HIP_INCLUDE` path to remove unnecessary additions to the include path. 
The list of changes to `cpp_extension.py` is as follows: 1. Call `hipify` when building a CUDAExtension for ROCm. 2. Prune the list of source files to CUDAExtension to include only the hipified versions of any source files in the list (if both original and hipified versions of the source file are in the list) 3. Add subdirectories of /opt/rocm/include to the include path for extensions, so that ROCm headers for subcomponent libraries are found automatically cc jeffdaily sunway513 hgaspar lcskrishna ashishfarmer Pull Request resolved: https://github.com/pytorch/pytorch/pull/45451 Reviewed By: ezyang Differential Revision: D24924736 Pulled By: malfet fbshipit-source-id: 4af42b8ff4f21c3782dedb8719b8f9f86b34bd2d	2020-11-18 08:37:49 -08:00
Chester Liu	17a6bc7c1b	Cleanup unused code for Python < 3.6 (#47822 ) Summary: I think these can be safely removed since the minimum supported Python version is now 3.6. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47822 Reviewed By: smessmer Differential Revision: D24954936 Pulled By: ezyang fbshipit-source-id: 5d4b2aeb78fc97d7ee4abaf5fb2aae21bf765e8b	2020-11-13 21:37:01 -08:00
peter	d73a8db2d2	Use local env for building CUDA extensions on Windows (#47150 ) Summary: Fixes https://github.com/pytorch/vision/pull/2818#issuecomment-719167504 After activating the VC env multiple times, the following error will be raised when building a CUDA extension. ``` FAILED: C:/tools/MINICO~1/CONDA-~2/TORCHV~1/work/build/temp.win-amd64-3.8/Release/tools/MINICO~1/CONDA-~2/TORCHV~1/work/torchvision/csrc/cuda/PSROIAlign_cuda.obj C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin\nvcc -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -Dtorchvision_EXPORTS -IC:\tools\MINICO~1\CONDA-~2\TORCHV~1\work\torchvision\csrc -I%PREFIX%\lib\site-packages\torch\include -I%PREFIX%\lib\site-packages\torch\include\torch\csrc\api\include -I%PREFIX%\lib\site-packages\torch\include\TH -I%PREFIX%\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\include" -I%PREFIX%\include -I%PREFIX%\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" "-IC:\Program Files (x86)\Microsoft Visual 
Studio\2019\Community\VC\Tools\MSVC\14.27.29110\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -I%PREFIX%\Library\include -c C:\tools\MINICO~1\CONDA-~2\TORCHV~1\work\torchvision\csrc\cuda\PSROIAlign_cuda.cu -o C:\tools\MINICO~1\CONDA-~2\TORCHV~1\work\build\temp.win-amd64-3.8\Release\tools\MINICO~1\CONDA-~2\TORCHV~1\work\torchvision\csrc\cuda\PSROIAlign_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_35,code=sm_35 -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_50,code=compute_50 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 'cl.exe' is not recognized as an internal or external command, operable program or batch file. ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/47150 Reviewed By: agolynski Differential Revision: D24706019 Pulled By: ezyang fbshipit-source-id: c13dc29f62d2d12d6a56f33dd450b467a1bf193b	2020-11-10 20:02:06 -08:00
Yuxin Wu	5cba3cec5a	fix extensions build flags on newer GPUs (#47585 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/47352 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47585 Reviewed By: heitorschueroff Differential Revision: D24833654 Pulled By: ezyang fbshipit-source-id: eaec5b8db5f35cac0a74d2858cb054a3853b0990	2020-11-10 11:38:18 -08:00
Simon Geisler	abae12ba41	only set ccbin flag if not provided by user (#47404 ) Summary: Avoid an nvcc error if the user specifies the C compiler (as pointed out in https://github.com/pytorch/pytorch/issues/47377) Fixes https://github.com/pytorch/pytorch/issues/47377 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47404 Reviewed By: ejguan Differential Revision: D24748833 Pulled By: malfet fbshipit-source-id: 1a4ad1f851c8854795f7f98e28f479a0ff458a00	2020-11-10 07:55:57 -08:00
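The behaviour named in the title of abae12ba41 can be sketched as follows: add a default `-ccbin` only when the user's nvcc flags do not already select a host compiler. The function name `with_default_ccbin` is hypothetical; `-ccbin` and its long form `--compiler-bindir` are nvcc options.

```python
def with_default_ccbin(nvcc_flags, host_compiler):
    """Prepend '-ccbin <host_compiler>' unless the user already chose one."""
    user_set = any(
        flag.startswith(("-ccbin", "--compiler-bindir")) for flag in nvcc_flags
    )
    if user_set:
        # Passing the option twice is what the commit aims to avoid.
        return list(nvcc_flags)
    return ["-ccbin", host_compiler, *nvcc_flags]


if __name__ == "__main__":
    print(with_default_ccbin(["-O2"], "g++"))
    print(with_default_ccbin(["-ccbin", "clang++", "-O2"], "g++"))
```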
Nikita Shulga	2b6a720eb1	Update pybind to 2.6.0 (#46415 ) Summary: Preserve PYBIND11 configuration options in `torch._C._PYBIND11_COMPILER_TYPE` and use them when building extensions. Also, use f-strings in `torch.utils.cpp_extension`. "Fixes" https://github.com/pytorch/pytorch/issues/46367 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46415 Reviewed By: VitalyFedyunin Differential Revision: D24605949 Pulled By: malfet fbshipit-source-id: 87340f2ed5308266a46ef8f0317316227dab9d4d	2020-10-29 10:53:47 -07:00
Nikita Shulga	42a51148c1	Use f-strings in torch.utils.cpp_extension (#47025 ) Summary: Plus two minor fixes to `torch/csrc/Module.cpp`: - Use iterator of type `Py_ssize_t` for array indexing in `THPModule_initNames` - Fix clang-tidy warning of unneeded defaultGenerator copy by capturing it as `const auto&` Pull Request resolved: https://github.com/pytorch/pytorch/pull/47025 Reviewed By: samestep Differential Revision: D24605907 Pulled By: malfet fbshipit-source-id: c276567d320758fa8b6f4bd64ff46d2ea5d40eff	2020-10-28 21:32:33 -07:00
Guilherme Leobas	789e935304	Annotate torch.nn.cpp (#46490 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/46489 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46490 Reviewed By: zhangguanheng66 Differential Revision: D24509519 Pulled By: ezyang fbshipit-source-id: edffd32ab2ac17ae4bbd44826b71f5cb9f1da1c5	2020-10-23 17:40:32 -07:00
Jithun Nair	65da50c099	Apply hip vs hipcc compilation flags correctly for building extensions (#46273 ) Summary: Fixes issues when building certain PyTorch extensions where the cpp files do NOT compile if flags such as `__HIP_NO_HALF_CONVERSIONS__` are defined. cc jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/46273 Reviewed By: zou3519 Differential Revision: D24422463 Pulled By: ezyang fbshipit-source-id: 7a43d1f7d59c95589963532ef3bd3c68cb8262be	2020-10-21 11:40:40 -07:00
Alexander Grund	5b0f400488	Replace list(map(...)) constructs by list comprehensions (#46461 ) Summary: As discussed in https://github.com/pytorch/pytorch/issues/46392 this makes the code more readable and possibly more performant. It also fixes a bug detected by this where the argument order of `map` was confused: `030a24906e (diff-5bb26bd3a23ee3bb540aeadcc0385df2a4e48de39f87ed9ea76b21990738fe98L1537-R1537)` Fixes https://github.com/pytorch/pytorch/issues/46392 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46461 Reviewed By: ailzhang Differential Revision: D24367015 Pulled By: ezyang fbshipit-source-id: d55a67933cc22346b00544c9671f09982ad920e7	2020-10-19 18:42:49 -07:00
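The class of bug mentioned in the commit above, swapped `map` arguments, is easy to reproduce in isolation. The sketch below uses made-up source paths and `os.path.basename` purely as stand-ins; it is not the code that was changed in the PR.

```python
import os

# Hypothetical list of extension sources.
sources = ["src/ops.cpp", "src/kernel.cu"]

# map() takes the function first and the iterable second.
via_map = list(map(os.path.basename, sources))

# A list comprehension has no argument order to confuse.
via_comprehension = [os.path.basename(source) for source in sources]


def swapped_map_fails():
    """Show that map(iterable, function) raises instead of mapping."""
    try:
        list(map(sources, os.path.basename))
    except TypeError:
        return True
    return False
```

With the arguments swapped, Python tries to iterate over the function object and raises `TypeError`, which is the kind of error that only surfaces when the affected code path actually runs.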
Alexandre Saint	c734961e26	[cpp-extensions] Ensure default extra_compile_args (#45956 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/45835 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45956 Reviewed By: ngimel Differential Revision: D24162289 Pulled By: albanD fbshipit-source-id: 9ba2ad51e818864f6743270212ed94d86457f4e6	2020-10-09 07:33:28 -07:00
Xiang Gao	2fa062002e	CUDA BFloat16 infrastructure (#44925 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44925 Reviewed By: agolynski Differential Revision: D23783910 Pulled By: ngimel fbshipit-source-id: dacac2ad87d58056bdc68bfe0b7ab1de5c2af0d8	2020-10-02 16:21:30 -07:00
Xiang Gao	0a15646e15	CUDA RTX30 series support (#45489 ) Summary: I also opened a PR on cmake upstream: https://gitlab.kitware.com/cmake/cmake/-/merge_requests/5292 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45489 Reviewed By: zhangguanheng66 Differential Revision: D23997844 Pulled By: ezyang fbshipit-source-id: 4e7443dde9e70632ee429184f0d51cb9aa5a98b5	2020-09-29 18:19:23 -07:00
Xiang Gao	20ac736200	Remove py2 compatible future imports (#44735 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735 Reviewed By: mruberry Differential Revision: D23731306 Pulled By: ezyang fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f	2020-09-16 12:55:57 -07:00
Nikita Shulga	4134b7abfa	Pass CC env variable as ccbin argument to nvcc (#43931 ) Summary: This is the common behavior when one builds PyTorch (or any other CUDA project) using CMake, so it should be held true for Torch CUDA extensions as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43931 Reviewed By: ezyang, seemethere Differential Revision: D23441793 Pulled By: malfet fbshipit-source-id: 1af392107a94840331014fda970ef640dc094ae4	2020-09-01 17:26:08 -07:00
Akihiro Nitta	f17d7a5556	Fix exception chaining in `torch/` (#43836 ) Summary: ## Motivation Fixes https://github.com/pytorch/pytorch/issues/43770. ## Description of the change This PR fixes exception chaining only in files under `torch/` where appropriate. To fix exception chaining, I used either: 1. `raise new_exception from old_exception` where `new_exception` itself does not seem descriptive enough to debug or `old_exception` delivers valuable information. 2. `raise new_exception from None` where raising both `new_exception` and `old_exception` seems a bit noisy and redundant. I subjectively chose which one to use from the above options. ## List of lines containing raise in except clause: I wrote [this simple script](https://gist.github.com/akihironitta/4223c1b32404b36c1b349d70c4c93b4d) using [ast](https://docs.python.org/3.8/library/ast.html#module-ast) to list the lines that `raise` inside an `except` clause. - [x] `000739c31a/torch/jit/annotations.py (L35)` - [x] `000739c31a/torch/jit/annotations.py (L150)` - [x] `000739c31a/torch/jit/annotations.py (L158)` - [x] `000739c31a/torch/jit/annotations.py (L231)` - [x] `000739c31a/torch/jit/_trace.py (L432)` - [x] `000739c31a/torch/nn/utils/prune.py (L192)` - [x] `000739c31a/torch/cuda/nvtx.py (L7)` - [x] `000739c31a/torch/utils/cpp_extension.py (L1537)` - [x] `000739c31a/torch/utils/tensorboard/_pytorch_graph.py (L292)` - [x] `000739c31a/torch/utils/data/dataloader.py (L835)` - [x] `000739c31a/torch/utils/data/dataloader.py (L849)` - [x] `000739c31a/torch/utils/data/dataloader.py (L856)` - [x] `000739c31a/torch/testing/_internal/common_utils.py (L186)` - [x] `000739c31a/torch/testing/_internal/common_utils.py (L189)` - [x] `000739c31a/torch/testing/_internal/common_utils.py (L424)` - [x] `000739c31a/torch/testing/_internal/common_utils.py (L1279)` - [x] `000739c31a/torch/testing/_internal/common_utils.py (L1283)` - [x] `000739c31a/torch/testing/_internal/common_utils.py (L1356)` - [x] 
`000739c31a/torch/testing/_internal/common_utils.py (L1388)` - [x] `000739c31a/torch/testing/_internal/common_utils.py (L1391)` - [ ] `000739c31a/torch/testing/_internal/common_utils.py (L1412)` - [x] `000739c31a/torch/testing/_internal/codegen/random_topo_test.py (L310)` - [x] `000739c31a/torch/testing/_internal/codegen/random_topo_test.py (L329)` - [x] `000739c31a/torch/testing/_internal/codegen/random_topo_test.py (L332)` - [x] `000739c31a/torch/testing/_internal/jit_utils.py (L183)` - [x] `000739c31a/torch/testing/_internal/common_nn.py (L4789)` - [x] `000739c31a/torch/onnx/utils.py (L367)` - [x] `000739c31a/torch/onnx/utils.py (L659)` - [x] `000739c31a/torch/onnx/utils.py (L892)` - [x] `000739c31a/torch/onnx/utils.py (L897)` - [x] `000739c31a/torch/serialization.py (L108)` - [x] `000739c31a/torch/serialization.py (L754)` - [x] `000739c31a/torch/distributed/rpc/_testing/faulty_agent_backend_registry.py (L76)` - [x] `000739c31a/torch/distributed/rpc/backend_registry.py (L260)` - [x] `000739c31a/torch/distributed/distributed_c10d.py (L184)` - [x] `000739c31a/torch/_utils_internal.py (L57)` - [x] `000739c31a/torch/hub.py (L494)` - [x] `000739c31a/torch/contrib/_tensorboard_vis.py (L16)` - [x] `000739c31a/torch/distributions/lowrank_multivariate_normal.py (L100)` - [x] `000739c31a/torch/distributions/constraint_registry.py (L142)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/43836 Reviewed By: ailzhang Differential Revision: D23431212 Pulled By: malfet fbshipit-source-id: 5f7f41b391164a5ad0efc06e55cd58c23408a921	2020-08-31 20:26:23 -07:00
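The two chaining styles listed in the commit above can be sketched as follows. The function names and error messages are hypothetical and chosen only to show the mechanics; they are not taken from the files in the checklist.

```python
def lookup_option(options, key):
    """Style 1: keep the original exception attached as __cause__."""
    try:
        return options[key]
    except KeyError as err:
        raise RuntimeError(f"missing build option {key!r}") from err


def parse_jobs(text):
    """Style 2: drop the original exception when it would only add noise."""
    try:
        return int(text)
    except ValueError:
        raise ValueError(f"expected an integer, got {text!r}") from None
```

With `from err` the traceback prints both exceptions, joined by "The above exception was the direct cause of the following exception". With `from None` only the new exception is shown.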
Nikita Shulga	6753157c5a	Enable torch.utils typechecks (#42960 ) Summary: Fix typos in torch.utils/_benchmark/README.md. Add empty __init__.py to examples folder to make example invocations from README.md correct. Fixed uniform distribution logic generation when minval and maxval are None. Fixes https://github.com/pytorch/pytorch/issues/42984 Pull Request resolved: https://github.com/pytorch/pytorch/pull/42960 Reviewed By: seemethere Differential Revision: D23095399 Pulled By: malfet fbshipit-source-id: 0546ce7299b157d9a1f8634340024b10c4b7e7de	2020-08-13 15:24:56 -07:00
Ralf Gommers	bcab2d6848	Add type annotations for cpp_extension, utils.data, signal_handling (#42647 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42647 Reviewed By: ezyang Differential Revision: D22967041 Pulled By: malfet fbshipit-source-id: 35e124da0be56934faef56834a93b2b400decf66	2020-08-06 09:42:07 -07:00
Thomas Viehmann	0f78e596ba	ROCm: Fix linking of custom ops in load_inline (#41257 ) Summary: Previously we did not link against amdhip64 (roughly equivalent to cudart). Apparently, the recent RTLD_GLOBAL fixes prevent the extensions from finding the symbols needed for launching kernels. Pull Request resolved: https://github.com/pytorch/pytorch/pull/41257 Reviewed By: zou3519 Differential Revision: D22573288 Pulled By: ezyang fbshipit-source-id: 89f9329b2097df26785e2f67e236d60984d40fdd	2020-07-17 12:14:50 -07:00
Edward Yang	22c7d183f7	If ninja is being used, force build_ext to run. (#40837 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40837 As ninja has accurate dependency tracking, if there is nothing to do, then we will very quickly noop. But this is important for correctness: if a change was made to a header that is not listed explicitly in the distutils Extension, then distutils will come to the wrong conclusion about whether or not recompilation is needed (but Ninja will work it out.) This caused https://github.com/pytorch/vision/issues/2367 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D22340930 Pulled By: ezyang fbshipit-source-id: 481b74f6e2cc78159d2a74d413751cf7cf16f592	2020-07-07 09:49:31 -07:00
Pavel Belevich	95e51bb7f8	change BuildExtension.with_options to return a class not a c-tor (#40121 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40121 Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D22076634 Pulled By: pbelevich fbshipit-source-id: a89740baf75208065e418d7f972eeb52db9ee3cf	2020-06-17 12:09:09 -07:00
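The difference between returning a constructor and returning a class, as in the commit above, can be sketched with a stand-in command class. `BuildCommand` and its `use_ninja` argument are assumptions made for this example; this is not the `BuildExtension` implementation.

```python
class BuildCommand:
    """Stand-in for a setuptools-style command class (hypothetical)."""

    def __init__(self, dist, use_ninja=True):
        self.dist = dist
        self.use_ninja = use_ninja

    @classmethod
    def with_options(cls, **options):
        """Return a subclass whose constructor pre-binds the given options."""

        class cls_with_options(cls):
            def __init__(self, *args, **kwargs):
                kwargs.update(options)
                super().__init__(*args, **kwargs)

        return cls_with_options
```

Because the result is a real subclass, it still passes `isinstance(..., type)` and `issubclass(...)` checks. A plain function or `functools.partial` wrapper would fail those checks wherever a command class is expected.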
lixinyu	7cb4eae8b1	correct some cpp extension code usages and documents (#39766 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39766 Test Plan: Imported from OSS Differential Revision: D21967284 Pulled By: glaringlee fbshipit-source-id: 8597916bee247cb5f8c82ed8297119d2f3a72170	2020-06-10 08:31:22 -07:00
Xiang Gao	b3fac8af6b	Initial support for building on Ampere GPU, CUDA 11, cuDNN 8 (#39277 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39277 This PR contains initial changes that makes PyTorch build with Ampere GPU, CUDA 11, and cuDNN 8. TF32 related features will not be included in this PR. Test Plan: Imported from OSS Differential Revision: D21832814 Pulled By: malfet fbshipit-source-id: 37f9c6827e0c26ae3e303580f666584230832d06	2020-06-02 10:03:42 -07:00
ashishfarmer	53b55d8f38	Use ninja build as default for HIPExtensions (#38939 ) Summary: This PR adds the following changes: 1. It sets the default extension build to use ninja 2. Adds HIPCC flags to the host code compile string for ninja builds. This is needed when host code makes HIP API calls cc: ezyang jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/38939 Differential Revision: D21721905 Pulled By: ezyang fbshipit-source-id: 75206838315a79850ecf86a78391a31ba5ee97cb	2020-05-27 11:35:19 -07:00
Yuxin Wu	0e2a0478af	Support paths with spaces when building ninja extension (#38670 ) Summary: Generates the following `build.ninja` file, which builds successfully: ``` cflags = -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA '-I/scratch/yuxinwu/space space/detectron2/layers/csrc' -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/TH -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/THC -I/public/apps/cuda/10.1/include -I/private/home/yuxinwu/miniconda3/include/python3.7m -c post_cflags = -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14 cuda_cflags = -DWITH_CUDA '-I/scratch/yuxinwu/space space/detectron2/layers/csrc' -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/TH -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/THC -I/public/apps/cuda/10.1/include -I/private/home/yuxinwu/miniconda3/include/python3.7m -c cuda_post_cflags = -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -ccbin=/public/apps/gcc/7.1.0/bin/gcc -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=sm_70 -std=c++14 ldflags = rule compile command = $cxx -MMD -MF $out.d $cflags -c $in -o $out $post_cflags depfile = $out.d deps = gcc rule cuda_compile command = $nvcc 
$cuda_cflags -c $in -o $out $cuda_post_cflags build /scratch/yuxinwu/space$ space/build/temp.linux-x86_64-3.7/scratch/yuxinwu/space$ space/detectron2/layers/csrc/vision.o: compile /scratch/yuxinwu/space$ space/detectron2/layers/csrc/vision.cpp build /scratch/yuxinwu/space$ space/build/temp.linux-x86_64-3.7/scratch/yuxinwu/space$ space/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_cpu.o: compile /scratch/yuxinwu/space$ space/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_cpu.cpp build /scratch/yuxinwu/space$ space/build/temp.linux-x86_64-3.7/scratch/yuxinwu/space$ space/detectron2/layers/csrc/ROIAlignRotated/ROIAlignRotated_cpu.o: compile /scratch/yuxinwu/space$ space/detectron2/layers/csrc/ROIAlignRotated/ROIAlignRotated_cpu.cpp build /scratch/yuxinwu/space$ space/build/temp.linux-x86_64-3.7/scratch/yuxinwu/space$ space/detectron2/layers/csrc/nms_rotated/nms_rotated_cpu.o: compile /scratch/yuxinwu/space$ space/detectron2/layers/csrc/nms_rotated/nms_rotated_cpu.cpp build /scratch/yuxinwu/space$ space/build/temp.linux-x86_64-3.7/scratch/yuxinwu/space$ space/detectron2/layers/csrc/ROIAlign/ROIAlign_cpu.o: compile /scratch/yuxinwu/space$ space/detectron2/layers/csrc/ROIAlign/ROIAlign_cpu.cpp ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/38670 Differential Revision: D21689613 Pulled By: ppwwyyxx fbshipit-source-id: 1f71b12433e18f6b0c6aad5e1b390b4438654563	2020-05-21 14:57:40 -07:00
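The generated file above handles spaces in two ways: paths on `build` lines use ninja's `$ ` escape, while include flags are wrapped in shell quotes. A minimal sketch of both follows, assuming a hypothetical helper `ninja_escape_path`. It is not the function used in `cpp_extension.py`.

```python
import shlex


def ninja_escape_path(path):
    """Escape a path for use on a ninja `build` line.

    In ninja syntax `$$` is a literal dollar sign, `$ ` a literal space and
    `$:` a literal colon. The dollar sign is escaped first so that the
    escapes added afterwards are not escaped again.
    """
    return path.replace("$", "$$").replace(" ", "$ ").replace(":", "$:")


def quote_flag(flag):
    """Quote a compiler flag so the shell passes it as a single argument."""
    return shlex.quote(flag)
```

For example, `ninja_escape_path("/scratch/space space/vision.o")` yields the `space$ space` form seen on the `build` lines, and `quote_flag("-I/scratch/space space/csrc")` yields the single-quoted include flag seen in `cflags`.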
peter	a40049fd2a	Better handling for msvc env when compiling cpp extensions (#38862 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/38861#issuecomment-631934636. 1. Error out if msvc env is activated but `DISTUTILS_USE_SDK` is not set. 2. Attempt to activate msvc env before running ninja build Pull Request resolved: https://github.com/pytorch/pytorch/pull/38862 Differential Revision: D21686343 Pulled By: ezyang fbshipit-source-id: 38b366654e2d0376dbdd21276689772b78e9718e	2020-05-21 12:52:22 -07:00
peter	4e46c95826	Fix cpp extension build failure if path contains space (#38860 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38860 Differential Revision: D21686335 Pulled By: ezyang fbshipit-source-id: 2675f4f70b48ae3b58ea597a2b584b446d03c704	2020-05-21 12:36:27 -07:00
lixinyu	5a979fcb99	allow user passing relative paths in include_dirs within setuptools.setup (#38264 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38264 Test Plan: Imported from OSS Differential Revision: D21509277 Pulled By: glaringlee fbshipit-source-id: b0bc17d375a89b96b1bdacde5987b4f4baa9468e	2020-05-13 20:00:12 -07:00
ashish	5a386a0a78	Fix ldflags string for HIPExtensions (#38047 ) Summary: This pull request adds a check for ROCm environment and skips adding CUDA specific flags for the scenario when a pytorch extension is built on ROCm. ezyang jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/38047 Differential Revision: D21470507 Pulled By: ezyang fbshipit-source-id: 5af2d7235e306c7aa9a5f7fc8760025417383069	2020-05-07 20:39:01 -07:00
ashishfarmer	402f635bbe	Enable ahead of time compilation for HIPExtensions using ninja (#37800 ) Summary: This pull request enables ahead of time compilation of HIPExtensions with ninja by setting appropriate compilation flags for ROCm environment. Also, this enables the unit test for testing cuda_extensions on ROCm as well as removing test for ahead of time compilation of extensions with ninja from ROCM_BLACKLIST ezyang jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/37800 Differential Revision: D21408148 Pulled By: soumith fbshipit-source-id: 146f4ffb3418f3534e6ce86805d3fe9c3eae84e1	2020-05-05 20:53:35 -07:00
peter	7c4bda7e6f	Eliminate warnings for cpp extensions on Windows (#37400 ) Summary: Improve the readability of the logs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/37400 Differential Revision: D21302597 Pulled By: ezyang fbshipit-source-id: b8cbd33f95b6839ad4c6930bed8750c9b5a2ef7a	2020-04-30 20:28:03 -07:00
SsnL	13013848d5	Fix cpp_ext build dir create permission (#34239 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/34238 Pull Request resolved: https://github.com/pytorch/pytorch/pull/34239 Differential Revision: D21328036 Pulled By: soumith fbshipit-source-id: dac2735383b1a689139af5a23f61ccbebd1fd6c1	2020-04-30 11:30:07 -07:00
Lukas Koestler	0048243f70	Check compiler -v to determine compiler (fix #33701 ) (#37293 ) Summary: As described in the issue (https://github.com/pytorch/pytorch/issues/33701) the compiler check for building cpp extensions does not work with ccache. In this case we check compiler -v to determine which compiler is actually used and check it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/37293 Differential Revision: D21256913 Pulled By: ezyang fbshipit-source-id: 5483a10cc2dbcff98a7f069ea9dbc0c12b6502dc	2020-04-27 10:49:04 -07:00
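When the compiler is reached through a wrapper such as ccache, the executable name says nothing about the real compiler, which is why the commit above inspects the output of `compiler -v`. A rough sketch of classifying that output follows. The function name and the matched phrases are assumptions based on the banners GCC and Clang commonly print, not the check used in PyTorch.

```python
def classify_compiler(version_output):
    """Guess the compiler family from the text printed by `<compiler> -v`.

    Hypothetical helper: it only looks for the usual version banner lines.
    """
    text = version_output.lower()
    # Check clang first, since it matches both "clang version" and
    # "Apple clang version" banners.
    if "clang version" in text:
        return "clang"
    if "gcc version" in text:
        return "gcc"
    return "unknown"
```

In practice the text would come from running the compiler, for example with `subprocess.run([compiler, "-v"], capture_output=True, text=True)`. GCC and Clang typically write this banner to stderr rather than stdout.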
David Reiss	e75fb4356b	Remove (most) Python 2 support from Python code (#35615 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615 Python 2 has reached end-of-life and is no longer supported by PyTorch. Now we can clean up a lot of cruft that we put in place to support it. These changes were all done manually, and I skipped anything that seemed like it would take more than a few seconds, so I think it makes sense to review it manually as well (though using side-by-side view and ignoring whitespace change might be helpful). Test Plan: CI Differential Revision: D20842886 Pulled By: dreiss fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed	2020-04-22 09:23:14 -07:00
Thomas Viehmann	d070c0bcf0	ROCm: enable cpp_extensions.load/load_inline (#35897 ) Summary: This enables cpp_extensions.load/load_inline. This works by hipify-ing cuda sources. Also enable tests. CuDNN/MIOpen extensions aren't yet supported, I propose to not do this in this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35897 Differential Revision: D20983279 Pulled By: ezyang fbshipit-source-id: a5d0f5ac592d04488a6a46522c58e2ee0a6fd57c	2020-04-13 11:44:08 -07:00
lizz	5d1205bf02	Suppress output when checking hipcc (#35789 ) Summary: Otherwise, it will print some message when hipcc is not found. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35789 Differential Revision: D20793089 Pulled By: ezyang fbshipit-source-id: 4b3cb29fb1d74a1931603ee01e669013ccae9685	2020-04-01 13:03:21 -07:00
hainq	a0dc36e501	[Windows] Fix torch_cuda's forced link (#35659 ) Summary: The current config on `master` yields the following errors when build from source on Windows with CMake and Visual Studio 2019. ``` Severity Code Description Project File Line Suppression State Error LNK2001 unresolved external symbol \?warp_size@cuda@at@YAHXZ\ torch D:\AI\pytorch\build_libtorch\caffe2\LINK 1 Severity Code Description Project File Line Suppression State Error LNK1120 1 unresolved externals torch D:\AI\pytorch\build_libtorch\bin\Release\torch.dll 1 Severity Code Description Project File Line Suppression State Error LNK2001 unresolved external symbol \?warp_size@cuda@at@YAHXZ\ caffe2_observers D:\AI\pytorch\build_libtorch\modules\observers\LINK 1 Severity Code Description Project File Line Suppression State Error LNK1120 1 unresolved externals caffe2_observers D:\AI\pytorch\build_libtorch\bin\Release\caffe2_observers.dll 1 Severity Code Description Project File Line Suppression State Error LNK2001 unresolved external symbol \?warp_size@cuda@at@YAHXZ\ caffe2_detectron_ops_gpu D:\AI\pytorch\build_libtorch\modules\detectron\LINK 1 Severity Code Description Project File Line Suppression State Error LNK1120 1 unresolved externals caffe2_detectron_ops_gpu D:\AI\pytorch\build_libtorch\bin\Release\caffe2_detectron_ops_gpu.dll 1 ``` This change at least fixes the above errors in that specific setting. Do you think it makes sense to get this merged or will it break other settings? Pull Request resolved: https://github.com/pytorch/pytorch/pull/35659 Differential Revision: D20735907 Pulled By: ezyang fbshipit-source-id: eb8fa1e69aaaa5af2da3a76963ddc910bb716479	2020-03-30 13:59:31 -07:00
Nikita Shulga	0f0a5b11b8	Disable C4251 when compiling cpp_extensions on Windows (#35272 ) Summary: Otherwise, VC++ will emit a warning for every exposed C++ symbol, for example: ``` include\c10/core/impl/LocalDispatchKeySet.h(53): warning C4251: 'c10::impl::LocalDispatchKeySet::included_': class 'c10::DispatchKeySet' needs to have dll-interface to be used by clients of struct 'c10::impl::LocalDispatchKeySet' ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/35272 Test Plan: CI Differential Revision: D20623005 Pulled By: malfet fbshipit-source-id: b635b674159bb9654e4e1a1af4394c4f36fe35bd	2020-03-24 11:08:28 -07:00
peterjc123	9e6cd98c3f	Ensure torch_cuda is linked against on Windows (#34288 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/31611. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34288 Differential Revision: D20314251 Pulled By: seemethere fbshipit-source-id: 15ab2d4de665d553a1622a2d366148697deb6c02	2020-03-12 12:16:44 -07:00
Yuxin Wu	20b18a58f1	Update compiler warning about ABI compatibility (#34472 ) Summary: `3ac4267763` already forces pytorch to use gcc>=5 everywhere Pull Request resolved: https://github.com/pytorch/pytorch/pull/34472 Differential Revision: D20345134 Pulled By: ezyang fbshipit-source-id: 3ce706405e8784cac5c314500466b5f988ad31bf	2020-03-10 08:12:07 -07:00
ashish	616beb1412	[ROCm] Added support for pytorch extensions to use HIP (#32669 ) Summary: This pull request has changes for: 1. Enabling a torch module with HIP code to be compiled by cpp_extensions.py 2. Fixes for hipify module to be able to be used by a torch extension cc: ezyang iotamudelta jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/32669 Differential Revision: D20033893 Pulled By: zou3519 fbshipit-source-id: fd6ddc8cdcd3930f41008636bb2bc9dd26cdb008	2020-02-21 12:10:02 -08:00
peter	ffe327f7d9	Revert "Disable flaky test TestCppExtensionAOT.test_cuda_extension in… (#33404 ) Summary: … Windows CI (https://github.com/pytorch/pytorch/issues/33282)" This reverts commit `5b922918d0`. Fixes https://github.com/pytorch/pytorch/issues/33270. Pull Request resolved: https://github.com/pytorch/pytorch/pull/33404 Differential Revision: D19972594 Pulled By: ezyang fbshipit-source-id: c8f67536fd6e4b7135171d621ad671b1b2a21fd4	2020-02-20 09:08:29 -08:00
Peter Bell	44af8ee6cd	Add pybind11 exception translator (#30588 ) Summary: Closes https://github.com/pytorch/pytorch/issues/30027 The idea here is that you can bind a function with `pybind11` in a single line and without modifying the function: ```cpp m.def("foo", foo, py::call_guard<torch::PyWarningHandler>()); ``` Where warnings are handled by the [`call_guard`](https://pybind11.readthedocs.io/en/stable/advanced/functions.html#call-guard) and exceptions are handled by the `pybind11` exception translator. To do this, I have added support for handling C++ exceptions in `torch::PyWarningHandler`'s destructor without setting the Python error state beforehand. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30588 Differential Revision: D19905626 Pulled By: albanD fbshipit-source-id: 90c0a5e298b123cc0c8ab9c52c91be4e96ea47c6	2020-02-18 11:33:29 -08:00
Richard Zou	28c5213a97	Add mechanism to pass a number of workers to cpp extensions (#33346 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33346 Fixes #33091 This PR lets users control the number of workers that cpp extensions uses through the environment variable `MAX_JOBS`. If the environment variable is a non-negative integer we use that many threads; otherwise, ninja falls back to the default. I chose to use the name `MAX_JOBS` because we use it in PyTorch already to control the number of workers PyTorch builds with. There is a risk that users of cpp extensions already have `MAX_JOBS` set but we are hoping that that risk is small and/or it means semantically the same thing. Test Plan: - tested locally Differential Revision: D19911645 Pulled By: zou3519 fbshipit-source-id: d20ed42de4f845499ed38f1a1c73e9ccb620f780	2020-02-18 06:48:11 -08:00
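The `MAX_JOBS` rule described in the commit above (use the value when it is a non-negative integer, otherwise let ninja pick its default) can be sketched as follows. The helper name `ninja_command` is hypothetical. `-v` and `-j` are standard ninja flags, but the exact command line PyTorch builds is not shown in this log.

```python
import os


def ninja_command(environ=None):
    """Build a ninja command line, honouring MAX_JOBS when it is valid."""
    environ = os.environ if environ is None else environ
    command = ["ninja", "-v"]
    max_jobs = environ.get("MAX_JOBS", "")
    # isdigit() rejects empty strings, signs and non-numeric text, so only
    # non-negative integers are forwarded to ninja.
    if max_jobs.isdigit():
        command.extend(["-j", max_jobs])
    return command
```

A shell invocation such as `MAX_JOBS=4 python setup.py install` would then limit the extension build to four parallel jobs, under the assumption that the build reads the variable this way.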
peter	769abddfa3	Build ahead-of-time C++ extensions with ninja on windows Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33084 Differential Revision: D19817361 Pulled By: ezyang fbshipit-source-id: 95a6d0ffa9beb6885c8a41688621b33da51706ae	2020-02-11 17:50:09 -08:00
Richard Zou	6209412647	Add option to use ninja to compile ahead-of-time cpp_extensions (#32495 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32495 Background ------------------------------ Previously, ninja was used to compile+link inline cpp_extensions and ahead-of-time cpp_extensions were compiled with distutils. This PR adds the ability to compile (but not link) ahead-of-time cpp_extensions with ninja. The main motivation for this is to speed up cpp_extension builds: distutils does not make use of parallelism. With this PR, using the new option, on my machine, - torchvision compilation goes from 3m43s to 49s - nestedtensor compilation goes from 2m0s to 28s. User-facing changes ------------------------------ I added a `use_ninja` flag to BuildExtension. This defaults to `True`. When `use_ninja` is True: - it will attempt to use ninja. - If we cannot use ninja, then this throws a warning and falls back to distutils. - Situations where we cannot use ninja: Windows (NYI, I'll open a new issue for this), if ninja cannot be found on the system. Implementation Details ------------------------------ This PR makes this change in two steps. Please let me know if it would be easier to review this if I split this up into a stacked diff. Those changes are: 1) refactor _write_ninja_file to separate the policy (what compiler flags to pass) from the mechanism (how to write the ninja file and do compilation). 2) call _write_ninja_file and _run_ninja_build while building ahead-of-time cpp_extensions. These are only used to compile objects; distutils still handles the linking. Change 1: refactor _write_ninja_file to separate policy from mechanism - I split _write_ninja_file into: _write_ninja_file and _write_ninja_file_to_build_library - I renamed _build_extension_module to _run_ninja_build Change 2: Call _write_ninja_file while building ahead-of-time cpp_extensions - _write_ninja_file_and_compile_objects calls _write_ninja_file to only build object files. 
- We monkey-patch distutils.CCompiler.compile to call _write_ninja_file_and_compile_objects - distutils still handles the linking step. The linking step is not a bottleneck so it was not a concern. - This change only works on unix-based systems. Our code for windows goes down a different codepath and I did not want to mess with that. - If a system does not support ninja, we raise a warning and fall back to the original compilation path. Test Plan ------------------------------ Adhoc testing - I built torchvision using pytorch master and printed out the build commands. Next, I used this branch to build torchvision and looked at the ninja file. I compared the ninja file with the build commands and asserted that they were functionally the same. - I repeated the above for pytorch/nestedtensor. PyTorch test suite - I split `test_cpp_extensions` into `test_cpp_extensions_aot` and `test_cpp_extensions_jit`. The AOT (ahead-of-time) version tests ahead-of-time and the JIT version tests just-in-time (not to be confused with TorchScript). - `test_cpp_extensions_aot` gets run TWICE by run_test.py, once with a module that was built with ninja, and once with a module that was built without ninja. - run_test.py asserts that when we are building with use_ninja=True, ninja is actually available on the system. Test Plan: Imported from OSS Differential Revision: D19730432 Pulled By: zou3519 fbshipit-source-id: 819590d01cf65e8da5a1e8019b8b3084792fee90	2020-02-05 18:49:29 -08:00
peter	1e5aead35b	Make cuda search process of cpp extension quiet (#32620 ) Summary: Fixes https://discuss.pytorch.org/t/error-with-cpp-extentions/67559. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32620 Differential Revision: D19576164 Pulled By: soumith fbshipit-source-id: 076229322375774bec03ef2632fc233000c15391	2020-01-26 20:26:43 -08:00
Brian Wignall	f326045b37	Fix typos, via a Levenshtein-type corrector (#31523 ) Summary: Should be non-semantic. Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking. Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 . Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523 Differential Revision: D19216749 Pulled By: mrshenli fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea	2020-01-17 16:03:19 -08:00
Edward Yang	8614860210	Uniformly apply Windows logic in cpp_extensions everywhere (#31161 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31161 Previously, it wasn't necessary to specify `DT_NEEDED` in C++ extensions on Linux (aka pass `-l` flags) because all of the symbols would have already been loaded with `RTLD_GLOBAL`, so there wouldn't be any undefined symbols. But when we switch to loading `_C` with `RTLD_LOCAL`, it's now necessary for all the C++ extensions to know what libraries to link with. The resulting code is clearer and more uniform, so it's a win all around. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D19262578 Pulled By: ezyang fbshipit-source-id: a893cc96f2e9aad1c064a6de4f7ccf79257dec3f	2020-01-09 07:28:11 -08:00
Edward Yang	9c9d3cd550	Revert D19262570: Fix race condition when creating build dir Test Plan: revert-hammer Differential Revision: D19262570 Original commit changeset: bb18c72e4264 fbshipit-source-id: 40675ef6ef4c98629deaaef0b25956f92534ff50	2020-01-03 11:17:42 -08:00
Kaiyu Shi	8c425dd201	Fix race condition when creating build dir (#30956 ) Summary: The original `check-and-act` style can raise `FileExistsError` when multiple processes are jit-compiling the extension on the same node. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30956 Differential Revision: D19262570 Pulled By: ezyang fbshipit-source-id: bb18c72e42648770b47f9378ac7c3929c3c03efc	2020-01-03 07:58:26 -08:00
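The race in 8c425dd201 comes from checking for the directory and then creating it as two separate steps. A minimal Python 3 sketch of a race-tolerant alternative follows. `ensure_build_dir` is a hypothetical helper used only for illustration; it is not taken from the patch itself, which was reverted in 9c9d3cd550.

```python
import os


def ensure_build_dir(path):
    """Create `path` if it is missing and tolerate concurrent creation.

    Hypothetical helper, not the code from the commit above.
    """
    # Racy check-and-act form, shown for comparison only:
    #     if not os.path.exists(path):
    #         os.makedirs(path)   # may raise FileExistsError
    # With exist_ok=True, a directory created by another process between
    # the check and the call is not treated as an error.
    os.makedirs(path, exist_ok=True)
    return path
```

Calling the helper twice on the same path is harmless, which is the property several JIT-compiling processes on one node rely on.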
Richard Zou	9305f44854	Remove BUILD_NAMEDTENSOR from codegen and .cu files (#31047 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31047 Changelist: - remove BUILD_NAMEDTENSOR from .cu files - remove BUILD_NAMEDTENSOR special handling in function_wrapper.py - remove BUILD_NAMEDTENSOR from cpp_extension.py. This code actually did nothing because we always compile with BUILD_NAMEDTENSOR. Test Plan: - run tests Differential Revision: D18908442 Pulled By: zou3519 fbshipit-source-id: b239e24de58580adaf3cef573350773a38b1e4f0	2019-12-11 08:49:56 -08:00
Edward Yang	38986e1dea	Split libtorch.so back into libtorch_{cpu,cuda,hip} (#30315 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30315 The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of https://github.com/pytorch/pytorch/pull/29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This led to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an exported fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failure when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. 
This was a bit difficult to debug because of an unrelated HIP bug, see https://github.com/ROCm-Developer-Tools/HIP/issues/1706 Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D18790941 Pulled By: ezyang fbshipit-source-id: 01296f6089d3de5e8365251b490c51e694f2d6c7	2019-12-04 08:04:57 -08:00
Sebastian Messmer	bc2e6d10fa	Back out "Revert D17908478: Switch PyTorch/Caffe2 to C++14" Summary: Original commit changeset: 775d2e29be0b Test Plan: CI Reviewed By: mruberry Differential Revision: D18775520 fbshipit-source-id: a350b3f86b66d97241f208786ee67e9a51172eac	2019-12-03 14:33:43 -08:00
Sebastian Messmer	a2ed50c920	Revert D17908478: Switch PyTorch/Caffe2 to C++14 Test Plan: revert-hammer Differential Revision: D17908478 Original commit changeset: 6e340024591e fbshipit-source-id: 775d2e29be0bc3a0db64f164c8960c44d4877d5d	2019-11-27 14:57:05 -08:00
Sebastian Messmer	d0acc9c085	Switch PyTorch/Caffe2 to C++14 (#30406 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30406 ghstack-source-id: 94642238 Test Plan: waitforsandcastle Differential Revision: D17908478 fbshipit-source-id: 6e340024591ec2c69521668022999df4a33b4ddb	2019-11-27 10:47:31 -08:00
Junjie Bai	352731bd6e	Revert D18632773: Split libtorch.so back into libtorch_{cpu,cuda,hip} Test Plan: revert-hammer Differential Revision: D18632773 Original commit changeset: ea717c81e0d7 fbshipit-source-id: 18601439f9f81c9f389020e5a0e4e04adb21772d	2019-11-21 15:01:09 -08:00
Edward Yang	ec30d9028a	Split libtorch.so back into libtorch_{cpu,cuda,hip} (#29731 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29731 The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. Some subtleties about the patch: - There were a few functions that crossed CPU-CUDA boundary without API macros. I just added them, easy enough. An inverse situation was aten/src/THC/THCTensorRandom.cu where we weren't supposed to put API macros directly in a cpp file. - DispatchStub wasn't getting all of its symbols related to static members on DispatchStub exported properly. I tried a few fixes but in the end I just moved everyone off using DispatchStub to dispatch CUDA/HIP (so they just use normal dispatch for those cases.) Additionally, there were some mistakes where people incorrectly were failing to actually import the declaration of the dispatch stub, so added includes for those cases. - torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) - The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 - In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This led to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" - A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly. This situation also happens with custom C++ extensions. - There's a ROCm compiler bug where extern "C" on functions is not respected. There's a little workaround to handle this. 
- Because I was too lazy to check if HIPify was converting TORCH_CUDA_API into TORCH_HIP_API, I just made it so HIP build also triggers the TORCH_CUDA_API macro. Eventually, we should translate and keep the nature of TORCH_CUDA_API constant in all cases. Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D18632773 Pulled By: ezyang fbshipit-source-id: ea717c81e0d7554ede1dc404108603455a81da82	2019-11-21 11:27:33 -08:00
albanD	c0104a1c89	Fix typo in comment in cpp_extension (#30028 ) Summary: From https://github.com/pytorch/pytorch/issues/26614 Pull Request resolved: https://github.com/pytorch/pytorch/pull/30028 Differential Revision: D18597666 Pulled By: albanD fbshipit-source-id: 93bf0e4ee34a63df4b544d44f630a9c0fc95fd83	2019-11-20 07:16:48 -08:00
Alban Desmaison	0ff1696c75	add pybind version of HANDLE_TH_ERRORS Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26614 Test Plan: Imported from OSS Differential Revision: D18249634 Pulled By: albanD fbshipit-source-id: 25503f368926e0f3633c5af0f222c9bb4729f342	2019-11-07 08:35:11 -08:00
Ralf Gommers	92c63d90e8	Remove support for old architectures in cpp_extension and CMake (#24442 ) Summary: This is a follow-up to gh-23408. No longer supported are any arches < 3.5 (numbers + 'Fermi' and 'Kepler+Tegra'). Pull Request resolved: https://github.com/pytorch/pytorch/pull/24442 Differential Revision: D16889283 Pulled By: ezyang fbshipit-source-id: 3c0c35d51b7ac7642d1be7ab4b0f260ac93b60c9	2019-08-19 06:23:33 -07:00
Ralf Gommers	a3b8607811	Fix test_jit_cuda_archflags failure on py27 due to changing dict order. (#24501 ) Summary: See gh-23408. Was failing for `pytorch_linux_xenial_cuda9_cudnn7_py2_test`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/24501 Differential Revision: D16860932 Pulled By: soumith fbshipit-source-id: 715858d905f74a23e42a9a1da97f036a3e30f0c9	2019-08-16 12:44:16 -07:00
Ralf Gommers	cd20773701	Set CUDA arch correctly when building with torch.utils.cpp_extension (#23408 ) Summary: The old behavior was to always use `sm_30`. The new behavior is: - For building via a setup.py, check if `'arch'` is in `extra_compile_args`. If so, don't change anything. - If `TORCH_CUDA_ARCH_LIST` is set, respect that (can be 1 or more arches) - Otherwise, query device capability and use that. To test this, for example on a machine with `torch` installed for py37: ``` $ git clone https://github.com/pytorch/extension-cpp.git $ cd extension-cpp/cuda $ python setup.py install $ cuobjdump --list-elf build/lib.linux-x86_64-3.7/lltm_cuda.cpython-37m-x86_64-linux-gnu.so ELF file 1: lltm.1.sm_61.cubin ``` Existing tests in `test_cpp_extension.py` for `load_inline` and for compiling via `setup.py` in test/cpp_extensions/ cover this. Closes gh-18657 EDIT: some more tests: ``` from torch.utils.cpp_extension import load lltm = load(name='lltm', sources=['lltm_cuda.cpp', 'lltm_cuda_kernel.cu']) ``` ``` # with TORCH_CUDA_ARCH_LIST undefined or an empty string $ cuobjdump --list-elf /tmp/torch_extensions/lltm/lltm.so ELF file 1: lltm.1.sm_61.cubin # with TORCH_CUDA_ARCH_LIST = "3.5 5.2 6.0 6.1 7.0+PTX" $ cuobjdump --list-elf build/lib.linux-x86_64-3.7/lltm_cuda.cpython-37m-x86_64-linux-gnu.so ELF file 1: lltm_cuda.cpython-37m-x86_64-linux-gnu.1.sm_35.cubin ELF file 2: lltm_cuda.cpython-37m-x86_64-linux-gnu.2.sm_52.cubin ELF file 3: lltm_cuda.cpython-37m-x86_64-linux-gnu.3.sm_60.cubin ELF file 4: lltm_cuda.cpython-37m-x86_64-linux-gnu.4.sm_61.cubin ELF file 5: lltm_cuda.cpython-37m-x86_64-linux-gnu.5.sm_70.cubin ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/23408 Differential Revision: D16784110 Pulled By: soumith fbshipit-source-id: 69ba09e235e4f906b959fd20322c69303240ee7e	2019-08-15 15:25:15 -07:00
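The selection order described in cd20773701 (explicit `arch` flag, then `TORCH_CUDA_ARCH_LIST`, then the device capability) can be sketched roughly as below. This is an illustration under assumptions, not the code in `cpp_extension.py`: the function name `pick_cuda_arch_flags`, its parameters, and the exact `-gencode` strings it emits are all hypothetical choices made for this example.

```python
import os


def pick_cuda_arch_flags(extra_cflags, env=None, device_capability=None):
    """Return nvcc -gencode flags following the order in the commit message.

    Hypothetical helper: `device_capability` stands in for a queried
    (major, minor) tuple so the sketch runs without a GPU.
    """
    env = os.environ if env is None else env

    # 1. If the caller already passed an arch flag, change nothing.
    if any("arch" in flag for flag in extra_cflags):
        return []

    # 2. Respect TORCH_CUDA_ARCH_LIST when it is set and non-empty.
    arch_list = env.get("TORCH_CUDA_ARCH_LIST", "")
    if arch_list.strip():
        arches = arch_list.split()
    # 3. Otherwise fall back to the capability of the current device.
    elif device_capability is not None:
        arches = ["{}.{}".format(*device_capability)]
    else:
        return []

    flags = []
    for arch in arches:
        want_ptx = arch.endswith("+PTX")
        number = arch[: -len("+PTX")] if want_ptx else arch
        number = number.replace(".", "")
        flags.append("-gencode=arch=compute_{0},code=sm_{0}".format(number))
        if want_ptx:
            # Also embed PTX so newer GPUs can JIT-compile the kernel.
            flags.append(
                "-gencode=arch=compute_{0},code=compute_{0}".format(number)
            )
    return flags
```

Passing `env` and `device_capability` explicitly keeps the sketch testable; a real implementation would read the environment and query the device itself.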
Ralf Gommers	81e46d4f78	Fix build issue. CUDA may be installed in `$CUDA_HOME/lib` on macOS. (#23491 ) Summary: Closes gh-16955. Closes https://github.com/pytorch/vision/issues/977 On Linux both `lib64` and `lib` may be present (symlinked). The reports seem to all be about macOS, but it seems like this is also possibly more robust on Linux and can't hurt. So not treating platforms differently. Note that Eigen has a similar check in its CMake: ``` if(CUDA_64_BIT_DEVICE_CODE AND (EXISTS "${CUDA_TOOLKIT_ROOT_DIR}/lib64")) link_directories("${CUDA_TOOLKIT_ROOT_DIR}/lib64") else() link_directories("${CUDA_TOOLKIT_ROOT_DIR}/lib") endif() ``` There may be other issues for building from source on macOS, can't test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/23491 Differential Revision: D16538973 Pulled By: soumith fbshipit-source-id: cc309347b7d16e718e06878d3824d0a6e40b1019	2019-07-29 08:08:43 -07:00
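For comparison with the Eigen CMake check quoted in 81e46d4f78, a Python version of the same "prefer `lib64`, fall back to `lib`" idea could look like the sketch below. `cuda_lib_dir` is a hypothetical name, and the exact condition used in `cpp_extension.py` may differ.

```python
import os


def cuda_lib_dir(cuda_home):
    """Pick the CUDA library directory under `cuda_home`.

    Hypothetical helper mirroring the quoted Eigen check: use lib64 when
    it exists, otherwise use lib (the layout reported on macOS).
    """
    lib64 = os.path.join(cuda_home, "lib64")
    if os.path.isdir(lib64):
        return lib64
    return os.path.join(cuda_home, "lib")
```

Because the check looks only at the filesystem, it behaves the same on Linux and macOS, which matches the commit's choice not to treat platforms differently.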
peter	54c280863c	Add some compiler flags for building cpp extensions on Windows (#23472 ) Summary: (1) Add `COMMON_MSVC_FLAGS` to the flags in the ninja codepath (2) Add `/EHsc` to `COMMON_MSVC_FLAGS` (3) Remove `-fPIC` and `-std=c++11` from the flags in the windows codepath Pull Request resolved: https://github.com/pytorch/pytorch/pull/23472 Differential Revision: D16532993 Pulled By: soumith fbshipit-source-id: bc2d983f5f8b4eae9c7385bf170f155679e92e87	2019-07-28 20:33:18 -07:00
Ralf Gommers	34f53564b4	Don't warn when using conda compilers with utils.cpp_extension (#23396 ) Summary: The conda compilers are gcc/c++ 7.3.0, but have custom version strings for clarity: x86_64-conda_cos6-linux-gnu-cc x86_64-conda_cos6-linux-gnu-c++ Using these compilers to build a C++ or CUDA extension now gives this warning (unnecessarily): ``` !! WARNING !! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Your compiler (/home/rgommers/anaconda3/envs/pytorch-nightly/bin/x86_64-conda_cos6-linux-gnu-c++) is not compatible with the compiler Pytorch was built with for this platform, which is g++ on linux. ... ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/23396 Differential Revision: D16500637 Pulled By: soumith fbshipit-source-id: 5b2fc3593e22e9a7d07dc2c0456dbb4934ffddb2	2019-07-26 10:17:14 -07:00
HaoTang@descartes	0ea8e61f03	For consistent CUDA_HOME behavior (#22845 ) Summary: Align the behavior of `torch.utils.cpp_extension.CUDA_HOME` with that of `tools.setup_helpers.cuda.CUDA_HOME`. Typically, I swapped the position of guess 2 and guess 3 in `torch.utils.cpp_extension.CUDA_HOME` . Fixing issue https://github.com/pytorch/pytorch/issues/22844 Pull Request resolved: https://github.com/pytorch/pytorch/pull/22845 Differential Revision: D16276241 Pulled By: zou3519 fbshipit-source-id: 3b62b439b2f794a6f3637a5fee58991f430985fe	2019-07-16 09:55:56 -07:00
Andrew Jones	e2216ada65	Properly formats errors rising up from C++ extension compilation (#22445 ) Summary: Here's a C++ extension with a missing semicolon: ```python torch.utils.cpp_extension.load_inline('test', 'int main() { return 0 }') ``` which currently generates this error ``` RuntimeError: Error building extension 'test_v6': b'[1/2] c++ -MMD -MF main.o.d -DTORCH_EXTENSION_NAME=test_v6 -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /tmp/torch_extensions/test/main.cpp -o main.o\nFAILED: main.o \nc++ -MMD -MF main.o.d -DTORCH_EXTENSION_NAME=test_v6 -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /tmp/torch_extensions/test/main.cpp -o main.o\n/tmp/torch_extensions/test/main.cpp: In function \xe2\x80\x98int main()\xe2\x80\x99:\n/tmp/torch_extensions/test/main.cpp:2:23: error: expected \xe2\x80\x98;\xe2\x80\x99 before \xe2\x80\x98}\xe2\x80\x99 token\n int main() { return 0 }\n ^\nninja: build stopped: subcommand failed.\n' ``` After this PR, the error is ``` RuntimeError: Error building extension 'test': [1/2] c++ -MMD -MF main.o.d -DTORCH_EXTENSION_NAME=test -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /tmp/torch_extensions/test/main.cpp -o main.o FAILED: main.o c++ -MMD -MF main.o.d -DTORCH_EXTENSION_NAME=test -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /tmp/torch_extensions/test/main.cpp -o main.o /tmp/torch_extensions/test/main.cpp: In function ‘int main()’: /tmp/torch_extensions/test/main.cpp:2:23: error: expected ‘;’ before ‘}’ token int main() { return 0 } ^ ninja: build stopped: subcommand failed. ``` which is a lot easier to read. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22445 Differential Revision: D16094205 Pulled By: ezyang fbshipit-source-id: 21043344aac260dc3e4e04d6a42898507bb840e4	2019-07-09 16:41:42 -07:00
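The difference between the two error messages in e2216ada65 is consistent with a `bytes` object being formatted directly into the message (hence the `b'...'` wrapper and the literal `\n` and `\xe2\x80\x98` escapes). A minimal sketch of the decode-before-formatting idea, assuming the build runs through `subprocess`; `run_and_report` is a hypothetical helper and not the actual change:

```python
import subprocess


def run_and_report(cmd, name):
    """Run a build command and raise a readable error if it fails.

    Hypothetical helper for illustration only.
    """
    try:
        output = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as exc:
        # exc.output is bytes; decoding it first avoids the b'...' repr
        # with escaped newlines and escaped UTF-8 quote characters.
        text = exc.output.decode("utf-8", errors="replace")
        raise RuntimeError(
            "Error building extension '{}': {}".format(name, text)
        )
    return output.decode("utf-8", errors="replace")
```

With the decode step, compiler diagnostics keep their real line breaks and quote characters when the exception is printed.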
peter	94bd5ddf7f	Add some essentials for building c++ extensions on Windows (#22563 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/22489. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22563 Differential Revision: D16142615 Pulled By: ezyang fbshipit-source-id: d7c27a874f788dd27065fad6699485e4a6372ec4	2019-07-06 19:29:25 -07:00
Hong Xu	693871ded3	Rename macros and build options NAMEDTENSOR_ENABLED to BUILD_NAMEDTENSOR (#22360 ) Summary: Currently the build system accepts USE_NAMEDTENSOR from the environment variable and turns it into NAMEDTENSOR_ENABLED when passing to CMake. This discrepancy does not seem necessary and complicates the build system. The naming of this build option is also semantically incorrect ("BUILD_" vis-a-vis "USE_"). This commit eradicates this issue before it is made into a stable release. The support of NO_NAMEDTENSOR is also removed, since PyTorch has been quite inconsistent about "NO_*" build options. --- Note: All environment variables with their names starting with `BUILD_` are currently automatically passed to CMake with no need of an additional wrapper. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22360 Differential Revision: D16074509 Pulled By: zou3519 fbshipit-source-id: dc316287e26192118f3c99b945454bc50535b2ae	2019-07-02 11:46:13 -07:00
Karl Ostmo	49481d576d	Torch rename (#20774 ) Summary: This renames the CMake `caffe2` target to `torch`, as well as renaming `caffe2_gpu` to `torch_gpu` (and likewise for other gpu target variants). Many intermediate variables that don't manifest as artifacts of the build remain for now with the "caffe2" name; a complete purge of `caffe2` from CMake variable names is beyond the scope of this PR. The shell `libtorch` library that had been introduced as a stopgap in https://github.com/pytorch/pytorch/issues/17783 is again flattened in this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20774 Differential Revision: D15769965 Pulled By: kostmo fbshipit-source-id: b86e8c410099f90be0468e30176207d3ad40c821	2019-06-12 20:12:34 -07:00
Richard Zou	835a6b9da2	Fix namedtensor build (#21609 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21609 ghimport-source-id: 648a0bcd28db2cdda1bf2fa6a904ca8f851088c2 Differential Revision: D15747687 Pulled By: zou3519 fbshipit-source-id: 2a972a15fa7399391617fc6e6b19879b86568c3a	2019-06-11 06:53:50 -07:00
Clément Pinard	f8aa6a8f44	Make a deep copy of extra_compile_flag dictionary (#20221 ) Summary: See issue #20169 Pull Request resolved: https://github.com/pytorch/pytorch/pull/20221 Differential Revision: D15317126 Pulled By: ezyang fbshipit-source-id: 0a12932db4f6ba15ea1d558fa329ce23fe2baef6	2019-05-13 08:11:39 -07:00
Karl Ostmo	4ba28deb6e	Unify libtorch and libcaffe2 (#17783 ) Summary: This PR is an intermediate step toward the ultimate goal of eliminating "caffe2" in favor of "torch". This PR moves all of the files that had constituted "libtorch.so" into the "libcaffe2.so" library, and wraps "libcaffe2.so" with a shell library named "libtorch.so". This means that, for now, `caffe2/CMakeLists.txt` becomes a lot bigger, and `torch/CMakeLists.txt` becomes smaller. The torch Python bindings (`torch_python.so`) still remain in `torch/CMakeLists.txt`. The follow-up to this PR will rename references to `caffe2` to `torch`, and flatten the shell into one library. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17783 Differential Revision: D15284178 Pulled By: kostmo fbshipit-source-id: a08387d735ae20652527ced4e69fd75b8ff88b05	2019-05-10 09:50:53 -07:00
peter	3bfdffe487	Fix default CXX for Windows in cpp_extensions.py (#19052 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/19017. Pull Request resolved: https://github.com/pytorch/pytorch/pull/19052 Differential Revision: D14846702 Pulled By: soumith fbshipit-source-id: b0e4dadaa749da0fa2d0405a1a064820d094220a	2019-04-08 23:14:22 -07:00
Soumith Chintala	e0c593eae7	detect C++ ABI flag for cpp extensions from available runtime information (#18994 ) Summary: Previously, when a user built PyTorch from source, but set the version string manually to be binary-formatted, it would've simply used CXX11_ABI=0 incorrectly. We have this information available at runtime with `torch._C._GLIBCXX_USE_CXX11_ABI`, so this PR improves the situation by simply using that information. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18994 Differential Revision: D14839393 Pulled By: soumith fbshipit-source-id: ca92e0810b29ffe688be82326e02a64a5649a3ad	2019-04-08 17:50:03 -07:00
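The idea in e0c593eae7 is to derive the ABI define from what the running build reports instead of guessing from the version string. A small sketch, assuming the flag is passed in as a boolean; `cxx11_abi_flag` is a hypothetical helper, not a function from `cpp_extension.py`:

```python
def cxx11_abi_flag(uses_cxx11_abi):
    """Build the -D_GLIBCXX_USE_CXX11_ABI define from a runtime boolean.

    Hypothetical helper; the boolean would come from the installed
    PyTorch build rather than from parsing its version string.
    """
    return "-D_GLIBCXX_USE_CXX11_ABI=" + str(int(bool(uses_cxx11_abi)))
```

With PyTorch installed, the call would presumably be `cxx11_abi_flag(torch._C._GLIBCXX_USE_CXX11_ABI)`, using the attribute named in the commit message.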
mooncake4132	d6d0fcc92b	Add c10_cuda to libraries in CUDAExtension for Windows (#18982 ) Summary: This change was necessary for me to compile [apex](https://github.com/NVIDIA/apex) on Windows. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18982 Differential Revision: D14819818 Pulled By: soumith fbshipit-source-id: 37ff9b93a72ab2b7c87f23a61e9f776c71c4c1a8	2019-04-06 10:30:51 -07:00
BloodAxe	5ade96fc84	Update cpp_extension.py (#18638 ) Summary: Hi. It seems that when building CPP-extensions with CUDA for Windows, the `extra_cuda_cflags` options are not properly forwarded to `nvcc`. Use of extra CUDA options is necessary to build, for instance, InplaceABN (https://github.com/mapillary/inplace_abn), which requires the `--expt-extended-lambda` option. This PR adds one line that correctly appends `extra_cuda_cflags`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18638 Differential Revision: D14704270 Pulled By: ezyang fbshipit-source-id: e1e330d193d9afd5707a5437a74c0499460d2b90	2019-04-02 07:56:38 -07:00
Thomas Viehmann	2b7a5d1876	don't include /usr/include when nvcc is in /usr/bin (#18127 ) Summary: ...because gcc will have failures with very strange error messages if you do. This affects people with Debian/Ubuntu-provided NVCC, the PR should not change anything for anyone else. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18127 Differential Revision: D14504386 Pulled By: soumith fbshipit-source-id: 1aea168723cdc71cdcfffb3193ee116108ae755e	2019-03-18 12:18:27 -07:00
peterjc123	fe90ee9dc8	Add /MD to prevent linking errors on Windows Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17799 Differential Revision: D14385777 Pulled By: ezyang fbshipit-source-id: 8c1d9f80c48399087f5fae4474690e6d80d740e6	2019-03-08 10:46:25 -08:00
peter	c78da0c6ed	Enable using CMD when building cpp extensions on Windows Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17706 Differential Revision: D14346482 Pulled By: ezyang fbshipit-source-id: 7c85e51c701f6c0947ad324ef19fafda40ae1cb9	2019-03-06 14:45:31 -08:00
Zachary DeVito	21193bf123	try to get rid of tmp_install (#16414 ) Summary: Rehash of previous attempts. This tries a different approach where we accept the install as specified in cmake (leaving bin/ include/ and lib/ alone), and then try to adjust the rest of the files to this more standard layout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16414 Differential Revision: D13863635 Pulled By: zdevito fbshipit-source-id: 23725f5c64d7509bf3ca8f472dcdcad074de9828	2019-01-29 17:29:40 -08:00
Jon Crall	c7ec7cdd46	Fixed syntax error in doctest (#15646 ) Summary: I fixed a very small extra parenthesis in a doctest. I'm also going to use this issue as a place to propose the eventual inclusion of xdoctest (a pip installable library I wrote) in pytorch's test suite. I think there are a lot of problems with Python's built in doctest module, and I've built xdoctest to fix them. I would love for my project to get some exposure and its addition to PyTorch may benefit both projects. Please see the readme for more details on what xdoctest brings to the table over the builtin doctest module: https://github.com/Erotemic/xdoctest I came across this small syntax error when working on ensuring xdoctest was compatible with pytorch. It isn't 100% there yet, but I'm working on it. My goal is to ensure that xdoctest is 100% compatible with all of torch's doctest out-of-the-box before writing up the PR. I'm also airing the idea out-loud before I commit too much time into this (or get my hopes up), so I'm attaching this little blurb to a no-brainer-merge PR to (1) demonstrate a little bit of value (because xdoctest flagged this syntax error) and (2) see how it's received. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15646 Differential Revision: D13606111 Pulled By: soumith fbshipit-source-id: d4492801a38ee0ae64ea0326a83239cee4d811a4	2019-01-09 01:29:11 -08:00

207 Commits