350 Commits

Author SHA1 Message Date
PyTorch MergeBot
aef9e43fe6 Revert "Replaced deprecated pkg_resources.packaging with packaging module (#113023)"
This reverts commit 81ea7a489a.

Reverted https://github.com/pytorch/pytorch/pull/113023 on behalf of https://github.com/atalman due to breaks nightlies ([comment](https://github.com/pytorch/pytorch/pull/113023#issuecomment-1802720774))
2023-11-08 21:39:59 +00:00
Alexander Grund
21b6030ac3 Don't set CUDA_HOME when not compiled with CUDA support (#106310)
Setting this (on import!) makes no sense because CUDA cannot be used with PyTorch in this case, and it leads to messages like
> No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
when CUDA happens to be installed, which is confusing at best.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106310
Approved by: https://github.com/ezyang
2023-11-06 21:48:49 +00:00
vfdev
81ea7a489a Replaced deprecated pkg_resources.packaging with packaging module (#113023)
Usage of `from pkg_resources import packaging` leads to a deprecation warning:
```
DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
```
and in strict tests where warnings are errors, this leads to CI breaks, e.g.: https://github.com/pytorch/vision/pull/8092

Replacing `pkg_resources.packaging` with `packaging`, as it is now a PyTorch dependency:
fa9045a872/requirements.txt (L19)
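
A minimal sketch of why the warning breaks strict test suites, using only the standard library. `legacy_import` is a hypothetical stand-in for the deprecated import and `run_strict` is an illustrative helper; neither is PyTorch code.

```python
import warnings


def legacy_import():
    # Hypothetical stand-in for `from pkg_resources import packaging`,
    # which emits a DeprecationWarning at import time.
    warnings.warn("pkg_resources is deprecated as an API.", DeprecationWarning, stacklevel=2)
    return "imported"


def run_strict(fn):
    # Emulate a test run where every warning is promoted to an error.
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            return fn()
        except DeprecationWarning as exc:
            return f"failed: {exc}"


def run_lenient(fn):
    # Emulate a default run where the warning is recorded but not fatal.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return fn()
```

Under the strict filter the import itself aborts the test, which is why switching to a non-deprecated import path removes the failure rather than hiding it.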
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113023
Approved by: https://github.com/Skylion007
2023-11-06 20:26:32 +00:00
Shaun Walbridge
0adb28b77d Show CUDAExtension example commands as code (#112764)
The default rendering of these code snippets renders the `TORCH_CUDA_ARCH_LIST` values with typographic quotes which prevent the examples from being directly copyable. Use code style for the two extension examples.

Fixes #112763
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112764
Approved by: https://github.com/malfet
2023-11-02 21:47:50 +00:00
Jeff Daily
28c0b07d19 [ROCm] remove HCC references (#111975)
- rename `__HIP_PLATFORM_HCC__` to `__HIP_PLATFORM_AMD__`
- rename `HIP_HCC_FLAGS` to `HIP_CLANG_FLAGS`
- rename `PYTORCH_HIP_HCC_LIBRARIES` to `PYTORCH_HIP_LIBRARIES`
- workaround in tools/amd_build/build_amd.py until submodules are updated

These symbols have had a long deprecation cycle and will finally be removed in ROCm 6.0.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111975
Approved by: https://github.com/ezyang, https://github.com/hongxiayang
2023-10-26 02:39:10 +00:00
Aleksei Nikiforov
ba04d84089 S390x inductor support (#111367)
Use arch compile flags. They are needed for vectorization support on s390x.
Implement new helper functions for inductor.

This change fixes multiple tests in test_cpu_repro.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111367
Approved by: https://github.com/ezyang
2023-10-20 19:38:46 +00:00
Aaron Gokaslan
cb856b08b2 [BE]: Attach cause to some exceptions and enable RUFF TRY200 (#111496)
Did some easy fixes from enabling TRY200. Most of these seem like oversights instead of intentional. The proper way to silence intentional errors is with `from None` to note that you thought about whether it should contain the cause and decided against it.
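
A small illustration of the two patterns described above, assuming nothing beyond the Python language itself. The function names are made up for this example.

```python
def parse_port(text):
    try:
        return int(text)
    except ValueError as err:
        # Attach the original error as the cause so the traceback shows both.
        raise RuntimeError(f"invalid port: {text!r}") from err


def lookup_option(options, key):
    try:
        return options[key]
    except KeyError:
        # Deliberately drop the cause: the KeyError adds nothing for the caller.
        raise ValueError(f"unknown option: {key}") from None
```

With `from err` the new exception's `__cause__` is the original error; with `from None` the cause is empty and the implicit context is suppressed in the traceback.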

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111496
Approved by: https://github.com/malfet
2023-10-19 21:56:36 +00:00
Kent Gauen
bb89a9e48c Skipped CUDA Flags if C++ Extension Name includes "arch" Substring (#111211)
The CUDA architecture flags from TORCH_CUDA_ARCH_LIST are skipped if TORCH_EXTENSION_NAME includes the substring "arch", but a C++ extension should be allowed to have any name. This change manually skips the TORCH_EXTENSION_NAME flag when checking whether any flag contains "arch". There is probably a better fix, but I'll leave that to the experts.
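
A simplified sketch of the failure mode and the workaround. The function names are hypothetical and the real check in `cpp_extension.py` may be structured differently; the only assumption is that the extension name reaches the compiler as a `-DTORCH_EXTENSION_NAME=...` define.

```python
def user_sets_arch_naive(cflags):
    # Buggy variant: any flag containing "arch" counts as an explicit
    # architecture choice, including the extension-name define.
    return any("arch" in flag for flag in cflags)


def user_sets_arch(cflags):
    # Workaround: ignore the extension-name define before the substring test.
    return any(
        "arch" in flag
        for flag in cflags
        if "TORCH_EXTENSION_NAME" not in flag
    )
```

An extension called `search_ext` trips the naive check because "search" contains "arch", while a genuine `-gencode=arch=...` flag is still detected by both variants.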

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111211
Approved by: https://github.com/ezyang
2023-10-14 00:10:01 +00:00
Dmytro Dzhulgakov
a0cea517e7 Add 9.0a to cpp_extension supported compute archs (#110587)
There's an extended compute capability 9.0a for Hopper that was introduced in CUDA 12.0: https://docs.nvidia.com/cuda/archive/12.0.0/cuda-compiler-driver-nvcc/index.html#gpu-feature-list

E.g. Cutlass leverages it: 5f13dcad78/python/cutlass/emit/pytorch.py (L684)

This adds it to the list of permitted architectures to use in `cpp_extension` directly.
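
As a rough sketch of what accepting `9.0a` means in practice, the helper below turns a `TORCH_CUDA_ARCH_LIST`-style string into nvcc `-gencode` flags. `arch_list_to_gencode` is a hypothetical name and the parsing is deliberately simplified; it is not the code in `cpp_extension`.

```python
def arch_list_to_gencode(arch_list):
    flags = []
    # Entries may be separated by semicolons or spaces, e.g. "8.0;9.0a+PTX".
    for entry in arch_list.replace(" ", ";").split(";"):
        if not entry:
            continue
        name, _, suffix = entry.partition("+")
        num = name.replace(".", "")  # "9.0a" -> "90a"
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if suffix == "PTX":
            # Also embed PTX so newer GPUs can JIT-compile the kernel.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags
```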
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110587
Approved by: https://github.com/ezyang
2023-10-05 17:41:06 +00:00
QuarticCat
20812d69e5 Fix extension rebuilding on Linux (#108613)
On Linux, CUDA header dependencies are not correctly tracked. After you modify a CUDA header, affected CUDA files won't be rebuilt. This PR will fix this problem.

```console
$ ninja -t deps
rep_penalty.o: #deps 2, deps mtime 1693956351892493247 (VALID)
    /home/qc/Workspace/NotMe/exllama/exllama_ext/cpu_func/rep_penalty.cpp
    /home/qc/Workspace/NotMe/exllama/exllama_ext/cpu_func/rep_penalty.h

rms_norm.cuda.o: #deps 0, deps mtime 1693961188871054130 (VALID)

rope.cuda.o: #deps 0, deps mtime 1693961188954388632 (VALID)

cuda_buffers.cuda.o: #deps 0, deps mtime 1693961188797719768 (VALID)

...
```

Historically, this line of code has been changed twice. It was first implemented in #49344 without an `if IS_WINDOWS` check, just as it is now. Then in #56015 someone added `if IS_WINDOWS` for an unknown reason. That PR has no description, so I don't know what bug its author encountered. I don't think there's any bug with these flags on Linux, at least today. CMake generates exactly the same flags for CUDA.

```ninja
#############################################
# Rule for compiling CUDA files.

rule CUDA_COMPILER__cpp_cuda_unscanned_Debug
  depfile = $DEP_FILE
  deps = gcc
  command = ${LAUNCHER}${CODE_CHECK}/opt/cuda/bin/nvcc -forward-unknown-to-host-compiler $DEFINES $INCLUDES $FLAGS -MD -MT $out -MF $DEP_FILE -x cu -c $in -o $out
  description = Building CUDA object $out
```

where `-MD` is short for `--generate-dependencies-with-compile` and `-MF` is short for `--dependency-output`. This can be verified with `nvcc --help`.
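
A hedged sketch of how a build-file generator could emit a CUDA rule with dependency tracking on every platform. The rule name, the `$cuda_cflags` variable, and the `$out.d` depfile path are assumptions made for this example; only the two nvcc flags and the ninja `depfile`/`deps` keys come from the text above.

```python
def cuda_compile_rule(nvcc="nvcc"):
    # Ask nvcc to write a Makefile-style dependency file next to the object.
    gendeps = "--generate-dependencies-with-compile --dependency-output $out.d"
    return "\n".join(
        [
            "rule cuda_compile",
            "  depfile = $out.d",  # where ninja finds the header list
            "  deps = gcc",        # the depfile uses gcc/Makefile syntax
            f"  command = {nvcc} {gendeps} $cuda_cflags -c $in -o $out",
        ]
    )


rule_text = cuda_compile_rule("/usr/local/cuda/bin/nvcc")
```

With a rule like this, `ninja -t deps` should list the headers each `.cuda.o` depends on instead of reporting `#deps 0`.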

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108613
Approved by: https://github.com/ezyang
2023-09-06 17:58:21 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster and better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)
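
For readers unfamiliar with the pattern RUF017 flags, here is a small standalone comparison. The function names are illustrative only.

```python
import itertools


def flatten_quadratic(lists):
    # Flagged pattern: each `+` copies the accumulated list, so the total
    # work grows quadratically with the number of elements.
    return sum(lists, [])


def flatten_linear(lists):
    # Linear alternative: walk every element exactly once.
    return list(itertools.chain.from_iterable(lists))
```

Both return the same result; only the cost differs as the input grows.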

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Xu Han
3f3479e85e reduce header file to boost cpp_wrapper build. (#107585)
1. Remove unused header files from cpp_wrapper.
2. Clean the PCH cache when use_pch is False.

The first change will reduce the build time from 7.35s to 4.94s.

Before change:
![image](https://github.com/pytorch/pytorch/assets/8433590/fc5c1d37-ec40-44f3-8d4d-bf26bdc674bb)
After change:
![image](https://github.com/pytorch/pytorch/assets/8433590/c7ccadd2-bf3a-4d30-bf56-6e3b0230a194)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107585
Approved by: https://github.com/ezyang, https://github.com/jansel, https://github.com/jgong5
2023-08-22 11:58:47 +00:00
Han, Xu
5ed60477a7 Optimize load inline via pch (#106696)
Add a precompiled header (PCH) to reduce load_inline build time.
PCH is a built-in GCC mechanism: https://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Precompiled-Headers.html

Add a PCH for `#include <torch/extension.h>`. This file is used in all load_inline modules, so all of them benefit from this PR.

Changes:
1. Add a PCH signature to guarantee that the PCH (gch) file takes effect.
2. Unify the functions that get the C++ compiler.
3. Unify the functions that get the build flags.

Before this PR:
![image](https://github.com/pytorch/pytorch/assets/8433590/f190cdcb-236c-4312-b165-d419a7efafe3)

After this PR:
![image](https://github.com/pytorch/pytorch/assets/8433590/b45c5ad3-e902-4fc8-b450-743cf73505a4)

Compile time is reduced from 14.06s to 7.36s.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106696
Approved by: https://github.com/jgong5, https://github.com/jansel
2023-08-21 10:08:30 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster and better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
Nikita Shulga
bcc0f4bcab Move ASAN to clang12 and Ubuntu-22.04 (Jammy) (#106355)
- Modify `install_conda` to remove libstdc++ from libstdcxx-ng to use one from OS
- Modify `install_torchvision` to work around a weird glibc bug, where malloc interposers (such as ASAN) cause a hang in the internationalization library, see https://sourceware.org/bugzilla/show_bug.cgi?id=27653 and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90589
- Modify `torch.utils.cpp_extension` to recognize Ubuntu's clang as supported compiler

Extracted from https://github.com/pytorch/pytorch/pull/105260
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106355
Approved by: https://github.com/huydhn
ghstack dependencies: #106354
2023-08-03 05:36:04 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Justin Chu
abc1cadddb [BE] Enable ruff's UP rules and autoformat utils/ (#105424)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105424
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-18 20:17:25 +00:00
lcskrishna
004ff536e8 [ROCm] Fix circular recursion issue in hipification (#104085)
This PR fixes the circular-include issue during hipification by introducing current_state to track whether a file has already been processed (iterative DFS).
The issue arises when two header files include each other, which leads to circular recursion, i.e. an infinite loop.
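
The idea can be sketched as an iterative depth-first walk over an include graph with a per-file state. This is an illustration under simplified assumptions (a plain dict as the include graph, hypothetical names), not the hipify implementation.

```python
def process_includes(entry, includes):
    # state: missing = unseen, "in_progress" = on the stack, "done" = finished.
    state = {}
    order = []
    stack = [entry]
    while stack:
        current = stack[-1]
        if current not in state:
            state[current] = "in_progress"
            for dep in includes.get(current, []):
                # A header that is already in progress or done is skipped,
                # which is what breaks include cycles.
                if dep not in state:
                    stack.append(dep)
        else:
            stack.pop()
            if state[current] == "in_progress":
                state[current] = "done"
                order.append(current)
    return order
```

Because the walk uses an explicit stack, two headers that include each other are each processed once and the interpreter's recursion limit is never involved.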

Fixes related issues such as:
https://github.com/pytorch/pytorch/issues/93827
https://github.com/ROCmSoftwarePlatform/hipify_torch/issues/39

Error log:
```
  File "/opt/conda/lib/python3.8/posixpath.py", line 471, in relpath
    start_list = [x for x in abspath(start).split(sep) if x]
  File "/opt/conda/lib/python3.8/posixpath.py", line 375, in abspath
    if not isabs(path):
  File "/opt/conda/lib/python3.8/posixpath.py", line 63, in isabs
    sep = _get_sep(s)
  File "/opt/conda/lib/python3.8/posixpath.py", line 42, in _get_sep
    if isinstance(path, bytes):
RecursionError: maximum recursion depth exceeded while calling a Python object
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104085
Approved by: https://github.com/jithunnair-amd, https://github.com/malfet
2023-07-01 03:25:51 +00:00
Felix Erkinger
e140c9cc92 Fixes ROCM_HOME detection in case no hipcc is found in path (#95634)
If ROCM_HOME is not set as an environment variable,
the code tries to find hipcc in the path.
When hipcc is not found, the lookup yields an empty string instead of raising an exception,
so an empty string is returned instead of falling through to the hardcoded '/opt/rocm' as the third case.
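
The intended three-step lookup can be sketched as follows. `find_rocm_home` and its injectable `which` parameter are hypothetical, chosen so the example runs without ROCm installed; the actual function in `cpp_extension.py` may differ in detail.

```python
import os
import shutil


def find_rocm_home(env=None, which=shutil.which, fallback="/opt/rocm"):
    env = os.environ if env is None else env
    # 1. An explicit environment variable wins.
    rocm_home = env.get("ROCM_HOME")
    if rocm_home:
        return rocm_home
    # 2. Otherwise derive it from hipcc on PATH: <home>/bin/hipcc -> <home>.
    hipcc = which("hipcc")
    if hipcc:
        return os.path.dirname(os.path.dirname(hipcc))
    # 3. hipcc was not found: use the hardcoded default, never an empty string.
    return fallback
```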

Fixes #95633

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95634
Approved by: https://github.com/jithunnair-amd, https://github.com/ezyang
2023-06-28 19:39:26 +00:00
albanD
b81f1d1bee Speed up cpp extensions re-compilation (#104280)
Fixes https://github.com/pytorch/pytorch/issues/68066 to a large extent.

This is achieved by not touching files that don't need changing to make sure the ninja caching works as expected.
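
A minimal sketch of the "don't touch unchanged files" technique. `write_if_changed` is a hypothetical helper written for this example, and the file name is arbitrary.

```python
import os
import tempfile


def write_if_changed(path, content):
    # Skip the write when the file already holds exactly this content, so
    # its modification time stays put and ninja sees nothing to rebuild.
    try:
        with open(path) as f:
            if f.read() == content:
                return False
    except FileNotFoundError:
        pass
    with open(path, "w") as f:
        f.write(content)
    return True


# Usage: the second call with identical content leaves the file alone.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "build.ninja")
    first = write_if_changed(target, "rule compile\n")
    second = write_if_changed(target, "rule compile\n")
    third = write_if_changed(target, "rule link\n")
```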
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104280
Approved by: https://github.com/fmassa
2023-06-28 17:06:07 +00:00
Nikita Shulga
347463fddf [cpp-extensions] Add clang to the list of supported Linux compilers (#103349)
Not sure why it was excluded previously (an oversight, I guess).
Also, please note that `clang++` is already considered an acceptable compiler (as it ends with `g++` ;))

### <samp>🤖 Generated by Copilot at 55aa7db</samp>

> _`clang` or `gcc`, we don't care what you use_
> _We'll build our extensions with the tools we choose_
> _Don't try to stop us with your version string_
> _We'll update our logic and make our code sing_
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103349
Approved by: https://github.com/seemethere
2023-06-10 02:53:38 +00:00
Li-Huai (Allan) Lin
3c0072e7c0 [MPS] Prerequisite for MPS C++ extension (#102483)
In order to add MPS kernels to the torchvision codebase, we need to expose the MPS headers and allow Objective-C++ files to be used in extensions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102483
Approved by: https://github.com/malfet
2023-06-07 17:28:31 +00:00
Matthew Hoffman
29da75cc55 Enable mypy allow redefinition (#102046)
Related #101528

I tried to enable this in another PR but it uncovered a bunch of type errors: https://github.com/pytorch/pytorch/actions/runs/4999748262/jobs/8956555243?pr=101528#step:10:1305

The goal of this PR is to fix these errors.

---

This PR enables [allow_redefinition = True](https://mypy.readthedocs.io/en/stable/config_file.html#confval-allow_redefinition) in `mypy.ini`, which allows for a common pattern:

> Allows variables to be redefined with an arbitrary type, as long as the redefinition is in the same block and nesting level as the original definition.

`allow_redefinition` allows mypy to be more flexible by allowing reassignment to an existing variable with a different type... for instance (from the linked PR):

4a1e9230ba/torch/nn/parallel/data_parallel.py (L213)

A `Sequence[Union[int, torch.device]]` is narrowed to `Sequence[int]` thru reassignment to the same variable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102046
Approved by: https://github.com/ezyang
2023-05-24 07:05:30 +00:00
pminimd
59a3759d97 Update cpp_extension.py (#101285)
When we need to link extra libs, we should notice that 64-bit CUDA may be installed in "lib", not in "lib64".
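
A small sketch of the directory choice described above. `cuda_lib_dir` and its injectable `isdir` parameter are hypothetical, introduced so the logic can be exercised without a CUDA install.

```python
import os


def cuda_lib_dir(cuda_home, isdir=os.path.isdir):
    lib64 = os.path.join(cuda_home, "lib64")
    lib = os.path.join(cuda_home, "lib")
    # Prefer lib64, but fall back to lib when only that directory exists.
    if not isdir(lib64) and isdir(lib):
        return lib
    return lib64
```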

### <samp>🤖 Generated by Copilot at 05c1ca6</samp>

Improve CUDA compatibility in `torch.utils.cpp_extension` by checking for `lib64` or `lib` directory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101285
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-05-15 22:47:41 +00:00
Richard Barnes
5f92909faf Use correct standard when compiling NVCC on Windows (#100031)
Test Plan: Sandcastle

Differential Revision: D45129001

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100031
Approved by: https://github.com/ngimel
2023-05-01 16:28:23 +00:00
Aaron Gokaslan
e2a3817dfd [BE] Enable C419 rule for any all shortcircuiting (#99890)
Apparently https://github.com/pytorch/pytorch/pull/78142 made torch.JIT allow for simple generator expressions which allows us to enable rules that replace unnecessary list comprehensions with generators in any/all. This was originally part of #99280 but I split it off into this PR so that it can be easily reverted should anything break.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890
Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet
2023-04-25 15:02:13 +00:00
PyTorch MergeBot
cfacb5eaaa Revert "Use correct standard when compiling NVCC on Windows (#99492)"
This reverts commit db6944562e.

Reverted https://github.com/pytorch/pytorch/pull/99492 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally
2023-04-19 20:51:26 +00:00
Richard Barnes
db6944562e Use correct standard when compiling NVCC on Windows (#99492)
Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D45108690

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99492
Approved by: https://github.com/ezyang
2023-04-19 20:36:05 +00:00
Pruthvi Madugundu
08f125bcac [ROCm] Remove usage of deprecated ROCm component header includes (#97620)
- clang parameter 'amdgpu-target' changed to 'offload-arch'
- HIP and MIOpen includes path updated for extensions

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97620
Approved by: https://github.com/ezyang, https://github.com/jithunnair-amd
2023-03-28 19:28:38 +00:00
Stas Bekman
8275e5d2a8 [cpp_extension.py] fix bogus _check_cuda_version (#97602)
Currently if `setuptools<49.4.0` and there is a minor version mismatch `_check_cuda_version` fails with a misleading non-actionable error:
```
2023-03-24T20:21:35.0625644Z   RuntimeError:
2023-03-24T20:21:35.0628441Z   The detected CUDA version (11.2) mismatches the version that was used to compile
2023-03-24T20:21:35.0630681Z   PyTorch (11.3). Please make sure to use the same CUDA versions.
```
This condition shouldn't be failing since minor version match isn't required.

It fails because the other condition to have a certain version of `setuptools` isn't met. But that condition is written in a comment (!!!). So this PR changes it to actually tell the user how to fix the problem.

While at it, I adjusted the version number as a lower `setuptools>=49.4.0` is sufficient for this to work.

Thanks.

p.s. this problem manifests on `nvidia/cuda:11.2.2-cudnn8-devel-ubuntu20.04` docker image.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97602
Approved by: https://github.com/ezyang
2023-03-27 15:15:57 +00:00
mikey dagitses
461f088c96 add -std=c++17 to windows cuda compilations (#97515)
add -std=c++17 to windows cuda compilations

Summary:
We're using C++17 in headers that are compiled by C++
extensions. Support for this was not added when we upgraded to C++17.

Test Plan: Rely on CI.

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/97515).
* #97175
* __->__ #97515
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97515
Approved by: https://github.com/ezyang
2023-03-26 15:23:52 +00:00
Kazuaki Ishizaki
622a11d512 Fix typos under torch/utils directory (#97516)
This PR fixes typos in comments and messages of `.py` files under `torch/utils` directory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97516
Approved by: https://github.com/ezyang
2023-03-24 16:53:39 +00:00
mikey dagitses
bcff4773da add /std:c++17 to windows compilations when not using Ninja (#97445)
add /std:c++17 to windows compilations when not using Ninja

Summary:
This was overlooked when we upgraded to C++17.

Test Plan: Rely on CI.

Reviewers: ezyang

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/97445).
* #96603
* #97473
* #97175
* #97515
* __->__ #97445
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97445
Approved by: https://github.com/ezyang
2023-03-24 14:52:29 +00:00
mikey dagitses
bdaf402565 build C++ extensions on windows with /std:c++17 (#97413)
build C++ extensions on windows with /std:c++17

Summary:
We added -std=c++17 to Posix builds, but neglected to add this for
Windows. This just brings back parity.

Test Plan: Rely on CI.

Reviewers: ezyang

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/97413).
* #97175
* __->__ #97413
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97413
Approved by: https://github.com/ezyang
2023-03-23 13:31:29 +00:00
Xiao Wang
44d7bbfe22 [cpp extension] Allow setting PYTORCH_NVCC to a customized nvcc in torch cpp extension build (#96987)
per title

I can write a script named `nvcc` like this
```bash
#!/bin/bash
/opt/cache/bin/sccache /usr/local/cuda/bin/nvcc $@
```
and set its path to `PYTORCH_NVCC` (added in this PR), along with another `sccache-g++` script to env var `CXX`.
cfa6b52e02/torch/utils/cpp_extension.py (L2106-L2109)

With ninja, I can fully enable c-cached build on my cuda extensions.
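
Roughly, the override amounts to consulting the environment before falling back to the compiler under the CUDA install. `pick_nvcc` and the default `cuda_home` value are assumptions for illustration; only the `PYTORCH_NVCC` variable name comes from this PR.

```python
import os


def pick_nvcc(env=None, cuda_home="/usr/local/cuda"):
    env = os.environ if env is None else env
    # A user-supplied wrapper (for example an sccache script) takes priority.
    override = env.get("PYTORCH_NVCC")
    if override:
        return override
    return os.path.join(cuda_home, "bin", "nvcc")
```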
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96987
Approved by: https://github.com/ezyang
2023-03-17 17:05:17 +00:00
Aaron Gokaslan
dd5e6e8553 [BE]: Merge startswith calls - rule PIE810 (#96754)
Merges startswith and endswith calls into a single call that takes a tuple. Not only are these calls more readable, but they are also more efficient, as each string is iterated through only once.
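
A before/after illustration of the PIE810 rewrite; the function names are invented for this example.

```python
def is_cuda_source_chained(path):
    # Before: two separate suffix checks joined with `or`.
    return path.endswith(".cu") or path.endswith(".cuh")


def is_cuda_source(path):
    # After: one call with a tuple of accepted suffixes.
    return path.endswith((".cu", ".cuh"))
```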
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96754
Approved by: https://github.com/ezyang
2023-03-14 22:05:20 +00:00
cyy
a32be76a53 Disable more warnings on Windows CI test (#95933)
These warnings are disabled to avoid long logs in Windows tests. They are also currently disabled in CMake builds.
'/wd4624': MSVC complains "destructor was implicitly defined as delete" on c10::optional and other templates
'/wd4076': "unexpected tokens following preprocessor directive - expected a newline" on some headers
'/wd4068': "The compiler ignored an unrecognized [pragma]"

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95933
Approved by: https://github.com/ezyang
2023-03-03 07:11:13 +00:00
Eddie Yan
db8e91ef73 [CUDA] Split out compute capability 8.7 and 7.2 from others (#95803)
Follow up of #95008 to avoid building Jetson compute capabilities unnecessarily, also adds missing 7.2.

CC @ptrblck @malfet
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95803
Approved by: https://github.com/ezyang
2023-03-02 14:13:15 +00:00
Eddie Yan
13ebffe088 [CUDA] sm_87 / Jetson Orin support (#95008)
Surfaced from #94438 CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95008
Approved by: https://github.com/ezyang
2023-02-17 02:22:23 +00:00
PyTorch MergeBot
36dfbb08f3 Revert "Update Cutlass to v2.11 (#94188)"
This reverts commit a0f9abdcb6.

Reverted https://github.com/pytorch/pytorch/pull/94188 on behalf of https://github.com/ezyang due to bouncing this to derisk branch cut
2023-02-13 19:03:36 +00:00
Aaron Gokaslan
a0f9abdcb6 Update Cutlass to v2.11 (#94188)
Now that we are on CUDA 11+ exclusively, we can update NVIDIA's Cutlass to the next version. We also had to remove the CUDA build flag "-D__CUDA_NO_HALF_CONVERSIONS__", since Cutlass no longer builds with it set.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94188
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-02-12 20:45:03 +00:00
Aaron Gokaslan
67d9790985 [BE] Apply almost all remaining flake8-comprehension checks (#94676)
Applies the remaining flake8-comprehension fixes and checks. This change replaces all remaining unnecessary generator expressions with list/dict/set comprehensions, which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as `set(a for a in b)`, resolving them into just the set call.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676
Approved by: https://github.com/ezyang
2023-02-12 01:01:25 +00:00
Xuehai Pan
5b1cedacde [BE] [2/3] Rewrite super() calls in functorch and torch (#94588)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases that change the semantics should be kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94588
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-10 21:16:33 +00:00
Aaron Gokaslan
3ce1ebb6fb Apply some safe comprehension optimizations (#94323)
Optimize unnecessary collection cast calls, unnecessary calls to list, tuple, and dict, and simplify calls to the sorted builtin. This should strictly improve speed and improve readability.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94323
Approved by: https://github.com/albanD
2023-02-07 23:53:46 +00:00
Aaron Gokaslan
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: removes the need to inherit from object and removes unused future imports.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
bxia
70b3ea59ae [ROCM] Modify transcoding: absolute path ->relative path (#91845)
Fixes https://github.com/pytorch/pytorch/issues/91797
This PR compiles the transcoded file using a relative path so that the transcoded file is recorded in SOURCE.txt with a relative path, which ensures that packaging succeeds.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91845
Approved by: https://github.com/jithunnair-amd, https://github.com/ezyang
2023-01-13 23:00:57 +00:00
cyy
9710ac6531 Some CMake and CUDA cleanup given recent update to C++17 (#90599)
The main changes are:
1. Remove outdated checks for old compiler versions because they can't support C++17.
2. Remove outdated CMake checks because it now requires 3.18.
3. Remove outdated CUDA checks because we are moving to CUDA 11.

Almost all changes are in CMake files for easy auditing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90599
Approved by: https://github.com/soumith
2022-12-30 11:19:26 +00:00
joncrall
ad782ff7df Enable xdoctest runner in CI for real this time (#83816)
Builds on #83317 and enables running the doctests. Just need to figure out what is causing the failures.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83816
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-12-29 05:32:42 +00:00
Nikita Shulga
36ac095ff8 Migrate PyTorch to C++17 (#85969)
With CUDA-10.2 gone we can finally do it!

This PR mostly contains build system related changes, invasive functional ones are to be followed.
Among many expected tweaks to the build system, here are few unexpected ones:
 - Force onnx_proto project to be updated to C++17 to avoid `duplicate symbols` error when compiled by gcc-7.5.0, as storage rule for `constexpr` changed in C++17, but gcc does not seem to follow it
 - Do not use `std::apply` on CUDA but rely on the built-in variant, as it results in test failures when CUDA runtime picks host rather than device function when `std::apply` is invoked from CUDA code.
 - `std::decay_t` -> `::std::decay_t` and `std::move`->`::std::move` as VC++ for some reason claims that `std` symbol is ambiguous
 - Disable use of `std::aligned_alloc` on Android, as its `libc++` does not implement it.

Some prerequisites:
 - https://github.com/pytorch/pytorch/pull/89297
 - https://github.com/pytorch/pytorch/pull/89605
 - https://github.com/pytorch/pytorch/pull/90228
 - https://github.com/pytorch/pytorch/pull/90389
 - https://github.com/pytorch/pytorch/pull/90379
 - https://github.com/pytorch/pytorch/pull/89570
 - https://github.com/facebookincubator/gloo/pull/336
 - https://github.com/facebookincubator/gloo/pull/343
 - 919676fb32

Fixes https://github.com/pytorch/pytorch/issues/56055

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85969
Approved by: https://github.com/ezyang, https://github.com/kulinseth
2022-12-08 02:27:48 +00:00
Alexander Grund
5b51ca6808 Update CUDA compiler matrix (#86360)
Switch the GCC/Clang max versions to be exclusive, as `include/crt/host_config.h` checks only the major version for the upper bound. This makes the check less restrictive and matches the checks in the aforementioned header.
Also update the versions using that header in the CUDA SDKs.

Follow up to #82860

I noticed this as PyTorch 1.12.1 with CUDA 11.3.1 and GCC 10.3 was failing in the `test_cpp_extensions*` tests.

Example for CUDA 11.3.1 from the SDK header:

```
#if __GNUC__ > 11
// Error out
...
#if (__clang_major__ >= 12) || (__clang_major__ < 3) || ((__clang_major__ == 3) &&  (__clang_minor__ < 3))
// Error out
```
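
The effect of an exclusive upper bound can be sketched with plain tuple comparison. The function name and the concrete bounds below are illustrative assumptions that mirror the `__GNUC__ > 11` check quoted above; they are not the table in `cpp_extension.py`.

```python
def compiler_supported(version, min_inclusive, max_exclusive):
    # Any minor/patch release below the next major version passes,
    # just as the header only compares the major number.
    return min_inclusive <= version < max_exclusive


# Hypothetical bounds: GCC 6.0.0 up to, but not including, 12.0.0.
GCC_MIN = (6, 0, 0)
GCC_MAX_EXCLUSIVE = (12, 0, 0)
```

With an inclusive maximum such as `(11, 0)`, a release like 11.4.0 would be rejected even though the header accepts it; the exclusive form avoids that.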
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86360
Approved by: https://github.com/ezyang
2022-11-23 03:07:22 +00:00
Nikita Shulga
575e02df53 Fix CUDNN_PATH handling on Windows (#88898)
Fixes https://github.com/pytorch/pytorch/issues/88873
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88898
Approved by: https://github.com/kit1980
2022-11-11 21:19:26 +00:00
Eddie Yan
a7420d2ccb Hopper (sm90) support (#87736)
Essentially a followup of #87436

CC @xwang233 @ptrblck
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87736
Approved by: https://github.com/xwang233, https://github.com/malfet
2022-11-09 01:49:50 +00:00
Greg Hogan
71fe069d98 ada lovelace (arch 8.9) support (#87436)
changes required to be able to compile https://github.com/pytorch/vision and https://github.com/nvidia/apex for `sm_89` architecture
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87436
Approved by: https://github.com/ngimel
2022-10-24 21:25:36 +00:00
Nikita Shulga
c28cdb53ea [BE] Delete BUILD_SPLIT_CUDA option (#87502)
We link with cuDNN and cuBLAS dynamically for all configs anyway, since statically linked cuDNN is a different library than the dynamically linked one, increases the default memory footprint, etc. In addition, libtorch_cuda, even when compiled for all GPU architectures, no longer approaches the 2Gb binary size limit, so BUILD_SPLIT_CUDA can go away.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87502
Approved by: https://github.com/atalman
2022-10-22 06:00:59 +00:00
Alexander Grund
fe87ae692f Fix check_compiler_ok_for_platform on non-English locales (#85891)
The function checks the output of e.g. `c++ -v` for "gcc version". In a locale other than English, however, it might be "gcc-Version", which makes the check fail.
This causes the function to wrongly return false on systems where `c++` is a hardlink to `g++` and the current locale produces a different output format.

Fix this by setting `LC_ALL=C`.
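
A sketch of the approach, assuming the banner is obtained by running the compiler with `-v`. The helper names are hypothetical and the pattern match is simplified compared with the real check.

```python
import os
import re
import subprocess


def c_locale_env(base=None):
    # Copy the environment and force the untranslated "C" locale.
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"
    return env


def looks_like_gcc(version_output):
    # Only the untranslated banner is matched, hence the forced locale.
    return re.search(r"^gcc version", version_output, re.MULTILINE) is not None


def compiler_banner(compiler):
    # gcc prints its version banner on stderr, so merge it into stdout.
    result = subprocess.run(
        [compiler, "-v"],
        env=c_locale_env(),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout
```

`looks_like_gcc(compiler_banner("c++"))` then gives the same answer regardless of the user's locale settings.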

I found this as `test_utils.py` was failing in `test_cpp_compiler_is_ok`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85891
Approved by: https://github.com/ezyang
2022-09-29 18:36:36 +00:00
anjali411
0183c1e336 Add __all__ to torch.utils submodules (#85331)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85331
Approved by: https://github.com/albanD
2022-09-27 14:45:26 +00:00
chengscott
1bf2371365 Rename path on Windows from lib/x64 to lib\x64 (#83417)
Use `os.path.join` to join path
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83417
Approved by: https://github.com/ezyang
2022-08-15 14:47:19 +00:00
joncrall
4618371da5 Integrate xdoctest - Rebased (#82797)
This is a new version of #15648 based on the latest master branch.

Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR.

In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.)

Fixes https://github.com/pytorch/pytorch/issues/71105

@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797
Approved by: https://github.com/ezyang
2022-08-12 02:08:01 +00:00
Nikita Shulga
737fa85dd2 Update CUDA compiler matrix (#82860)
Update CUDA compiler versions to match ones defined in
https://docs.nvidia.com/cuda/archive/11.4.1/cuda-installation-guide-linux/index.html#system-requirements
https://docs.nvidia.com/cuda/archive/11.5.0/cuda-installation-guide-linux/index.html#system-requirements
https://docs.nvidia.com/cuda/archive/11.6.0/cuda-installation-guide-linux/index.html#system-requirements
https://docs.nvidia.com/cuda/archive/11.7.0/cuda-installation-guide-linux/index.html#system-requirements

Special-case 11.4.0, where the maximum supported GCC version is similar to 11.3 rather than to 11.4.1+

Fixes https://github.com/pytorch/pytorch/issues/81039
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82860
Approved by: https://github.com/huydhn
2022-08-06 00:46:30 +00:00
Xuehai Pan
e849ed3d19 Redirect print messages to stderr in torch.utils.cpp_extension (#82097)
### Description

Listed in the commit message:

> The user may want to use `python3 -c "..."` to get the torch library
> path and the include path. Printing messages to stdout will mess up
> the output.

I'm using the command:

```bash
LIBTORCH_PATH="$(
    python3 -c 'print(":".join(__import__("torch.utils.cpp_extension", fromlist=[None]).library_paths()))'
)"
export LD_LIBRARY_PATH="${LIBTORCH_PATH}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

This lets the command-line tools find the torch shared libraries. I think this would be a common use case for users who are writing C/C++ extensions.

I got:

```console
$ LIBTORCH_PATH="$(python3 -c 'print(":".join(__import__("torch.utils.cpp_extension", fromlist=[None]).library_paths()))')"

$ export LD_LIBRARY_PATH="${LIBTORCH_PATH}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

$ echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}"
LD_LIBRARY_PATH=No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda-11.6'
/opt/hostedtoolcache/Python/3.7.13/x64/lib/python3.7/site-packages/torch/lib:/usr/local/cuda-11.6/lib64:

$ ls -alh "${LIBTORCH_PATH}"
ls: cannot access 'No CUDA runtime is found, using CUDA_HOME='\''/usr/local/cuda-11.6'\'''$'\n''/opt/hostedtoolcache/Python/3.7.13/x64/lib/python3.7/site-packages/torch/lib': No such file or directory
```

This PR prints messages in `torch.utils.cpp_extension` to `stderr`, which allows users to get the correct result using `VAR="$(python3 -c '...')"`
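
The separation can be sketched as follows. This is an illustration under assumed names (`library_paths_with_notice` and the path `/opt/torch/lib` are hypothetical), not the code from the PR: diagnostics go to `stderr`, and only the machine-readable result goes to `stdout`.

```python
import sys


def library_paths_with_notice(paths):
    """Hypothetical stand-in for a helper that warns and then returns paths."""
    # Diagnostics go to stderr so they never end up inside "$(...)" captures.
    print("No CUDA runtime is found", file=sys.stderr)
    return list(paths)


def main():
    paths = library_paths_with_notice(["/opt/torch/lib"])
    # Only the machine-readable result is written to stdout.
    print(":".join(paths))


if __name__ == "__main__":
    main()
```

A shell command substitution captures only `stdout`, so the notice still reaches the terminal but stays out of the variable.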

### Issue

N/A

### Testing

N/A
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82097
Approved by: https://github.com/ezyang
2022-07-25 21:55:15 +00:00
Nikita Shulga
95c148e502 [BE] Turn _check_cuda_version into a function (#81603)
It was a class method, but it neither uses any class properties nor calls other class methods
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81603
Approved by: https://github.com/ezyang
2022-07-17 05:49:39 +00:00
Nikita Shulga
7e274964d3 [BE] Dismantle pyramid of doom in _check_cuda_version (#81602)
Replace `if stmt: doSmth; else: raise_or_return` with `if not stmt: raise_or_return; doSmth`
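
The refactoring pattern can be illustrated with a toy version check. Both functions below are hypothetical examples, not the real `_check_cuda_version`; they behave identically, but the second rejects bad inputs first and keeps the main path unindented.

```python
def check_version_nested(found, minimum):
    """Nested form: the happy path is buried inside the conditionals."""
    if found is not None:
        if found >= minimum:
            return f"ok: {found}"
        else:
            raise ValueError(f"version {found} is older than {minimum}")
    else:
        raise ValueError("no version found")


def check_version_flat(found, minimum):
    """Guard-clause form: reject bad inputs first, then do the work."""
    if found is None:
        raise ValueError("no version found")
    if found < minimum:
        raise ValueError(f"version {found} is older than {minimum}")
    return f"ok: {found}"
```
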
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81602
Approved by: https://github.com/ezyang
2022-07-17 05:49:39 +00:00
Jithun Nair
71ee384924 [ROCm] Use torch._C._cuda_getArchFlags to get list of gfx archs pytorch was built for (#80498)
*even if no GPUs are available*

When building PyTorch extensions for ROCm PyTorch, if the user doesn't specify a list of archs using the PYTORCH_ROCM_ARCH env var, we would like to use the list of gfx archs that PyTorch was built for as the default value. To do this successfully even in an environment where no GPUs are available, e.g. a build-only CPU node, we need to be able to get the list of archs. `torch.cuda.get_arch_list()` doesn't work here because it calls `torch.cuda.is_available()` first: 0922cc024e/torch/cuda/__init__.py (L463), which returns `False` if no GPUs are available, so `torch.cuda.get_arch_list()` returns an empty list. To get around this issue, we call the underlying API `torch._C._cuda_getArchFlags()`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80498
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-07-07 16:06:12 +00:00
PyTorch MergeBot
ec4be38ba9 Revert "To add hipify_torch as a submodule in pytorch/third_party (#74704)"
This reverts commit 93b0fec39d.

Reverted https://github.com/pytorch/pytorch/pull/74704 on behalf of https://github.com/malfet due to broke torchvision
2022-06-21 23:54:00 +00:00
Bhavya Medishetty
93b0fec39d To add hipify_torch as a submodule in pytorch/third_party (#74704)
`hipify_torch` as a submodule in `pytorch/third_party`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74704
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2022-06-21 18:56:49 +00:00
Xiao Wang
ef0332e36d Allow relocatable device code linking in pytorch CUDA extensions (#78225)
Close https://github.com/pytorch/pytorch/issues/57543

Doc: check `Relocatable device code linking:` in https://docs-preview.pytorch.org/78225/cpp_extension.html#torch.utils.cpp_extension.CUDAExtension
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78225
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-06-02 21:35:56 +00:00
Michael Suo
fb0f285638 [lint] upgrade mypy to latest version
Fixes https://github.com/pytorch/pytorch/issues/75927.

Had to fix some bugs and add some ignores.

To check if clean:
```
lintrunner --paths-cmd='git grep -Il .' --take MYPY,MYPYSTRICT
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76753

Approved by: https://github.com/malfet
2022-05-03 20:51:34 +00:00
PyTorch MergeBot
3d7428d9ac Revert "[lint] upgrade mypy to latest version"
This reverts commit 9bf18aab94.

Reverted https://github.com/pytorch/pytorch/pull/76753 on behalf of https://github.com/suo
2022-05-03 20:01:18 +00:00
Michael Suo
9bf18aab94 [lint] upgrade mypy to latest version
Fixes https://github.com/pytorch/pytorch/issues/75927.

Had to fix some bugs and add some ignores.

To check if clean:
```
lintrunner --paths-cmd='git grep -Il .' --take MYPY,MYPYSTRICT
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76753

Approved by: https://github.com/malfet
2022-05-03 19:43:28 +00:00
rraminen
7422ccea8b Hipify fixes for a successful DeepSpeed build
These commits are required to build DeepSpeed on ROCm without the hipify errors.

a41829d9ed
663c718462

cc: @jeffdaily

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76141
Approved by: https://github.com/jeffdaily, https://github.com/pruthvistony, https://github.com/albanD
2022-04-28 13:19:59 +00:00
Min Si
9562aedb58 ROCm: add HIP_HOME/include,lib in cpp_extensions (#75548)
Summary:
hip/hip_runtime.h and libamdhip64.so may be required to compile
extensions such as torch_ucc. They are in $ROCM_HOME/hip by default,
and may not be symlinked to $ROCM_HOME/include and $ROCM_HOME/lib.
This commit defines $ROCM_HOME/hip as $HIP_HOME, and adds its include
and lib paths when building a hipified extension.
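
A rough sketch of the resulting search paths, assuming the directory layout described above. The function name `rocm_search_paths` is hypothetical and is not part of `cpp_extension.py`.

```python
import os


def rocm_search_paths(rocm_home):
    """Return (include_dirs, library_dirs) for a hypothetical ROCm layout.

    Assumes HIP lives in <rocm_home>/hip, as described above.
    """
    hip_home = os.path.join(rocm_home, "hip")
    include_dirs = [
        os.path.join(rocm_home, "include"),
        os.path.join(hip_home, "include"),  # expected to hold hip/hip_runtime.h
    ]
    library_dirs = [
        os.path.join(rocm_home, "lib"),
        os.path.join(hip_home, "lib"),  # expected to hold libamdhip64.so
    ]
    return include_dirs, library_dirs
```
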

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75548

Test Plan:
## Verify OSS pytorch + TorchUCC on an AMD GPU machine (MI100)
- step 1. Install OSS pytorch
```
export ROCM_PATH=/opt/rocm-4.5.2
git clone https://github.com/pytorch/pytorch.git
cd pytorch
python3 tools/amd_build/build_amd.py

USE_NCCL=0 USE_RCCL=0 USE_KINETO=0 with-proxy python3 setup.py develop
USE_NCCL=0 USE_RCCL=0 USE_KINETO=0 with-proxy python3 setup.py install
```

- step2. Install torchUCC extension
```
# /opt/rocm-4.5.2/include/hip does not exist, need include /opt/rocm-4.5.2/hip/include at compile time
export ROCM_PATH=/opt/rocm-4.5.2
export RCCL_INSTALL_DIR=/opt/rccl-rocm-rel-4.4-rdc
git clone https://github.com/facebookresearch/torch_ucc.git
cd torch_ucc
UCX_HOME=$RCCL_INSTALL_DIR UCC_HOME=$RCCL_INSTALL_DIR WITH_CUDA=$ROCM_PATH python setup.py
```
Build log before fix (error "hip/hip_runtime.h: No such file or directory"): P493038915
Build log after fix: P493037572

Reviewed By: ezyang

Differential Revision: D35506098

Pulled By: minsii

fbshipit-source-id: 76cbb6d4eaa6549a00898c9d9ebaca47a55330e9
(cherry picked from commit d684c080edf1fbd293e3321151976812c1da8533)
2022-04-19 20:51:37 +00:00
provefar
7a243ddd19 Add import to importlib.abc
Fixes #70525

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-334d309cf512> in <module>
----> 1 lltm_cpp = load(name="lltm_cpp", sources=["lltm.cpp"])

/usr/lib/python3.10/site-packages/torch/utils/cpp_extension.py in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1122                 verbose=True)
   1123     '''
-> 1124     return _jit_compile(
   1125         name,
   1126         [sources] if isinstance(sources, str) else sources,

/usr/lib/python3.10/site-packages/torch/utils/cpp_extension.py in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1360         return _get_exec_path(name, build_directory)
   1361
-> 1362     return _import_module_from_library(name, build_directory, is_python_module)
   1363
   1364

/usr/lib/python3.10/site-packages/torch/utils/cpp_extension.py in _import_module_from_library(module_name, path, is_python_module)
   1751         spec = importlib.util.spec_from_file_location(module_name, filepath)
   1752         module = importlib.util.module_from_spec(spec)
-> 1753         assert isinstance(spec.loader, importlib.abc.Loader)
   1754         spec.loader.exec_module(module)
   1755         return module

AttributeError: module 'importlib' has no attribute 'abc'
```
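
The traceback arises because `import importlib` alone does not guarantee that the `importlib.abc` submodule is loaded. A minimal sketch of loading a module from a file path with the submodules imported explicitly (the helper `load_module_from_path` and the module name `demo_ext` are hypothetical):

```python
import importlib.abc  # explicit: `import importlib` alone may not load this submodule
import importlib.util
import os
import tempfile


def load_module_from_path(module_name, filepath):
    """Load a module from an explicit file path and return it."""
    spec = importlib.util.spec_from_file_location(module_name, filepath)
    if spec is None or not isinstance(spec.loader, importlib.abc.Loader):
        raise ImportError(f"cannot load {module_name!r} from {filepath!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Usage with a throwaway pure-Python module (hypothetical name `demo_ext`).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_ext.py")
    with open(path, "w") as handle:
        handle.write("ANSWER = 42\n")
    demo = load_module_from_path("demo_ext", path)
```
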
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75736
Approved by: https://github.com/ezyang
2022-04-14 03:32:30 +00:00
Edgar Andrés Margffoy Tuay
86deecd7be Check clang++/g++ version when compiling CUDA extensions (#63230)
Summary:
See https://github.com/pytorch/pytorch/issues/55267

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63230

Reviewed By: soulitzer

Differential Revision: D34159119

Pulled By: malfet

fbshipit-source-id: 6eef7582388bf6a42dcc1d82b6e4b1f40f418dd7
(cherry picked from commit 2056d0a0be7951602de22f8d3b4efc28dd71b6c2)
2022-02-24 08:32:32 +00:00
Andrey Talman
46f9e16afe Documenting cuda 11.5 windows issue (#73013)
Summary:
Adding documentation about compiling extension with CUDA 11.5 and Windows

Example of failure: https://github.com/pytorch/pytorch/runs/4408796098?check_suite_focus=true

 Note: Don't use torch/extension.h with CUDA 11.5 on Windows in your C++ code:
    Use the ATen interface instead of the torch interface in all CUDA 11.5 code on Windows. Compilation has been failing with errors due to a bug in nvcc.
    Example use:
        >>> #include <ATen/ATen.h>
        >>> at::Tensor SigmoidAlphaBlendForwardCuda(....)
    Instead of:
        >>> #include <torch/extension.h>
        >>> torch::Tensor SigmoidAlphaBlendForwardCuda(...)
    Currently open issue for nvcc bug: https://github.com/pytorch/pytorch/issues/69460
    Complete Workaround code example: cb170ac024

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73013

Reviewed By: malfet, seemethere

Differential Revision: D34306134

Pulled By: atalman

fbshipit-source-id: 3c5b9d7a89c91bd1920dc63dbd356e45dc48a8bd
(cherry picked from commit 87098e7f17)
2022-02-19 02:34:59 +00:00
Jithun Nair
8dfdc3df82 [ROCm] Refactor how to specify AMD gpu targets using PYTORCH_ROCM_ARCH (#61706)
Summary:
Remove all hardcoded AMD gfx targets

PyTorch build and Magma build will use rocm_agent_enumerator as
backup if PYTORCH_ROCM_ARCH env var is not defined

PyTorch extensions will use same gfx targets as the PyTorch build,
unless PYTORCH_ROCM_ARCH env var is defined

torch.cuda.get_arch_list() now works for ROCm builds

PyTorch CI dockers will continue to be built for gfx900 and gfx906 for now.

PYTORCH_ROCM_ARCH env var can be a space- or semicolon-separated list of gfx archs, e.g. "gfx900 gfx906" or "gfx900;gfx906"
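
One way to accept both separators is sketched below. The helper names are hypothetical and this is not necessarily how `cpp_extension.py` parses the variable; the fallback argument stands in for the list of archs PyTorch was built for.

```python
import os
import re


def parse_rocm_archs(value):
    """Split a space- or semicolon-separated arch string into a list."""
    return [arch for arch in re.split(r"[ ;]+", value.strip()) if arch]


def rocm_archs_from_env(default_archs, environ=None):
    """Hypothetical helper: prefer PYTORCH_ROCM_ARCH, else fall back to defaults."""
    environ = os.environ if environ is None else environ
    value = environ.get("PYTORCH_ROCM_ARCH")
    if value:
        return parse_rocm_archs(value)
    return list(default_archs)
```
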
cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61706

Reviewed By: seemethere

Differential Revision: D32735862

Pulled By: malfet

fbshipit-source-id: 3170e445e738e3ce373203e1e4ae99c84e645d7d
2021-12-13 15:41:40 -08:00
Nikita Shulga
bede18b061 Add support for C++ frontend wrapper on Linux (#69094)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69094

Partially addresses https://github.com/pytorch/pytorch/issues/68768

Test Plan: Imported from OSS

Reviewed By: seemethere

Differential Revision: D32730079

Pulled By: malfet

fbshipit-source-id: 854e4215ff66e087bdf354fed7a17e87f2649c87
2021-12-02 16:47:00 -08:00
Nikita Shulga
c08e95dd9c Introduce IS_LINUX and IS_MACOS global vars (#69093)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69093

Test Plan: Imported from OSS

Reviewed By: samdow

Differential Revision: D32730080

Pulled By: malfet

fbshipit-source-id: aa3f218d09814b4edd96b01c7b57b85fd58c47fc
2021-12-01 09:47:38 -08:00
Nikita Shulga
f6f1b580f8 Fix mypy in cpp_extension.py (#69101)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69101

Test Plan: Imported from OSS

Reviewed By: atalman, janeyx99

Differential Revision: D32730081

Pulled By: malfet

fbshipit-source-id: 76ace65b51850b74b175a3c4688c05e107873e8d
2021-11-30 16:01:55 -08:00
Jane Xu
78f970568c Add dummy op to use instead of searchsorted (#66964)
Summary:
Would help unblock https://github.com/pytorch/pytorch/issues/66818 if this actually works

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66964

Reviewed By: mruberry

Differential Revision: D31817942

Pulled By: janeyx99

fbshipit-source-id: 9e9a2bcb0c0479ec7000ab8760a2e64bf0e85e95
2021-10-21 12:56:22 -07:00
Pruthvi Madugundu
085e2f7bdd [ROCm] Changes not to rely on CUDA_VERSION or HIP_VERSION (#65610)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610

- Replace HIP_PLATFORM_HCC with USE_ROCM
- Don't rely on CUDA_VERSION or HIP_VERSION and use USE_ROCM and ROCM_VERSION.

- In the next PR
   - Will be removing the mapping from CUDA_VERSION to HIP_VERSION and CUDA to HIP in hipify.
   - HIP_PLATFORM_HCC is deprecated, so will add HIP_PLATFORM_AMD to support HIP host code compilation on gcc.

cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd

Reviewed By: jbschlosser

Differential Revision: D30909053

Pulled By: ezyang

fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06
2021-09-29 09:55:43 -07:00
peterjc123
e6dc7bc61b Subprocess encoding fixes for cpp extension (#63756)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/63584

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63756

Reviewed By: bdhirsh

Differential Revision: D30485046

Pulled By: ezyang

fbshipit-source-id: 4f0ac383da4e8843e2a602dceae85f389d7434ee
2021-08-24 10:46:11 -07:00
Nikita Shulga
9679fa7f30 Update cpp_extension.py (#61484)
Summary:
By default, the majority of Python-3.[6789] installations come with `pkg_resources.packaging` version 16.8 (or `setuptools` older than 49.6.0), which does not have major/minor properties on the Version class, as one can observe in https://github.com/pypa/setuptools/blob/v49.5.0/pkg_resources/_vendor/packaging/version.py
On the other hand, comparison operators exist, so why not use them to check for version equality
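
To illustrate the comparison-only approach without depending on any particular `Version` class, the sketch below uses plain tuples as a simplified stand-in. `parse_release` and `same_major` are hypothetical names: a major-version check becomes a range test that needs only ordering operators.

```python
def parse_release(version):
    """Turn a dotted release string such as "11.4.1" into a tuple of ints.

    Simplified stand-in for a real version parser: handles numeric parts only.
    """
    return tuple(int(part) for part in version.split("."))


def same_major(version, major):
    """Check the major version with ordering comparisons only."""
    parsed = parse_release(version)
    return (major,) <= parsed < (major + 1,)
```
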

Fixes https://github.com/pytorch/pytorch/issues/61036

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61484

Reviewed By: walterddr, seemethere

Differential Revision: D29643883

Pulled By: malfet

fbshipit-source-id: 3db9168c1b009ac3a278709083ea8c5b417471b8
2021-07-13 07:11:58 -07:00
Rong Rong (AI Infra)
2d0c6e60a7 going back to use packaging.version.parse instead (#61053)
Summary:
I think this may be related to https://app.circleci.com/pipelines/github/pytorch/vision/9352/workflows/9c8afb1c-6157-4c82-a5c8-105c5adac57d/jobs/687003

Apparently `pkg_resources.parse_version` returns a `pkg_resources.extern.packaging.version.Version` instead of a `packaging.version.Version`, and on some older versions of setuptools it doesn't seem to support the `.major`/`.minor` attributes. Changing it back to use `packaging.version.parse`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61053

Test Plan: CI

Reviewed By: samestep

Differential Revision: D29494322

Pulled By: walterddr

fbshipit-source-id: 294572a10b167677440d7404e5ebe007ab59d299
2021-06-30 16:23:59 -07:00
Edgar Andrés Margffoy Tuay
d46eb77b04 Improve CUDA extension building error/warning messages (#59665)
Summary:
See https://github.com/pytorch/pytorch/issues/55267

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59665

Reviewed By: mruberry

Differential Revision: D29462248

Pulled By: ezyang

fbshipit-source-id: 9de13a284a14a7cd24200b9684151ce652e1eb1e
2021-06-29 13:03:30 -07:00
Edgar Andrés Margffoy Tuay
6322f66878 Add python version and cuda-specific folder to store extensions (#60592)
Summary:
See https://github.com/pytorch/pytorch/issues/55267

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60592

Reviewed By: albanD

Differential Revision: D29353368

Pulled By: ezyang

fbshipit-source-id: 1fbcd021f1030132c0f950f33ce4a3a2fef351e0
2021-06-25 10:27:04 -07:00
albanD
0a0e024648 use importlib instead of imp as it supports python 3.5+ (#57160)
Summary:
Prevents an annoying deprecation warning when importing cpp_extensions

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57160

Reviewed By: astaff

Differential Revision: D28096751

Pulled By: albanD

fbshipit-source-id: f169ad4c4945b0fff54c0339052a29f95b9f1831
2021-05-03 05:56:25 -07:00
davidriazati@fb.com
4b96fc060b Remove distutils (#57040)
Summary:
[distutils](https://docs.python.org/3/library/distutils.html) is on its way out and will be deprecated-on-import for Python 3.10+ and removed in Python 3.12 (see [PEP 632](https://www.python.org/dev/peps/pep-0632/)). There's no reason for us to keep it around since all the functionality we want from it can be found in `setuptools` / `sysconfig`. `setuptools` includes a copy of most of `distutils` (which is fine to use according to the PEP), that it uses under the hood, so this PR also uses that in some places.

Fixes #56527
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57040

Pulled By: driazati

Reviewed By: nikithamalgifb

Differential Revision: D28051356

fbshipit-source-id: 1ca312219032540e755593e50da0c9e23c62d720
2021-04-29 12:10:11 -07:00
Sam Estep
75024e228c Add lint for unqualified type: ignore (#56290)
Summary:
The other half of https://github.com/pytorch/pytorch/issues/56272.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56290

Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI runs (before this PR was finished) failed:

- https://github.com/pytorch/pytorch/runs/2384511062
- https://github.com/pytorch/pytorch/actions/runs/765036024

Reviewed By: seemethere

Differential Revision: D27867219

Pulled By: samestep

fbshipit-source-id: e648f07b6822867e70833e23ddafe7fb7eaca235
2021-04-21 08:07:23 -07:00
Pruthvi Madugundu
b383b63550 [ROCm] Updating ROCM_HOME handling for >ROCm 4.0 (#55968)
Summary:
- This change is required to handle the case when hipcc is
  updated to the latest using update-alternatives.
- Update-alternatives support for a few ROCm binaries is available
  from ROCm 4.1 onwards.
- This change does not affect any previous versions of ROCm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55968

Reviewed By: mruberry

Differential Revision: D27785123

Pulled By: ezyang

fbshipit-source-id: 8467e468d8d51277fab9b0c8cbd57e80bbcfc7f7
2021-04-15 07:48:36 -07:00
Aleksei Kashapov
0b8bd22614 Fix bug with rebuilding extensions every import (#56015)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56015

Reviewed By: mruberry

Differential Revision: D27765934

Pulled By: ezyang

fbshipit-source-id: 65cace951fce5f2284ab91d8bd687ac89a2311fb
2021-04-14 13:25:01 -07:00
Jeff Daily
e5b97777e3 [ROCm] allow PYTORCH_ROCM_ARCH in cpp_extension.py (#54341)
Summary:
Allows extensions to override ROCm gfx arch targets.  Reuses the same env var used during cmake build for consistency.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54341

Reviewed By: bdhirsh

Differential Revision: D27244010

Pulled By: heitorschueroff

fbshipit-source-id: 279e1a41ee395a0596aa7f696b6e908cf7f5bb83
2021-03-23 13:06:00 -07:00
Colin Gravill
65087dd1d4 Fix broken link from load_inline to new test location (#53701)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53701

Reviewed By: jbschlosser

Differential Revision: D27047406

Pulled By: ezyang

fbshipit-source-id: 0be6e669cf41527d3ffeb101e5f36db07e41b4af
2021-03-15 13:53:15 -07:00
peterjc123
44ff79d849 Automatically set BUILD_SPLIT_CUDA for cpp exts (#52503)
Summary:
Fixes https://github.com/pytorch/vision/pull/3418#issuecomment-781673110

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52503

Reviewed By: malfet

Differential Revision: D26546857

Pulled By: janeyx99

fbshipit-source-id: a100b408e7cd28695145a1dda7f2fa081bb7f21f
2021-02-19 12:22:55 -08:00
Jane Xu
550c965b2e Re-enable test_standalone_load for Windows 11.1 (#51596)
Summary:
This fixes the previous erroring out by adding stricter conditions in cpp_extension.py.

To test, run a split torch_cuda build on Windows with `export BUILD_SPLIT_CUDA=ON && python setup.py develop` and then run the following test: `python test/test_utils.py TestStandaloneCPPJIT.test_load_standalone`. It should pass.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51596

Reviewed By: malfet

Differential Revision: D26213816

Pulled By: janeyx99

fbshipit-source-id: a752ce7f9ab9d73dcf56f952bed2f2e040614443
2021-02-03 08:58:34 -08:00
Nikita Shulga
f7313b3105 Fix Python.h discovery logic on some MacOS platforms (#51586)
Summary:
On all non-Windows platforms we should use the 'posix_prefix' scheme to discover the location of the Python.h header
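
A minimal sketch of that lookup using the standard `sysconfig` module. The function name `python_include_dir` is hypothetical, and picking the `nt` scheme on Windows is an assumption made for this example rather than something stated in the commit.

```python
import sys
import sysconfig


def python_include_dir():
    """Return the directory expected to contain Python.h.

    Sketch only: picks the sysconfig install scheme by platform.
    """
    scheme = "nt" if sys.platform == "win32" else "posix_prefix"
    return sysconfig.get_path("include", scheme=scheme)
```
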

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51586

Reviewed By: ezyang

Differential Revision: D26208684

Pulled By: malfet

fbshipit-source-id: bafa6d79de42231629960c642d535f1fcf7a427f
2021-02-02 21:38:37 -08:00
Jane Xu
88af2149e1 Add build option to split torch_cuda library into torch_cuda_cu and torch_cuda_cpp (#49050)
Summary:
Because of the size of our `libtorch_cuda.so`, linking with other hefty binaries presents a problem where 32bit relocation markers are too small and end up overflowing. This PR attempts to break up `torch_cuda` into `torch_cuda_cu` and `torch_cuda_cpp`.

`torch_cuda_cu`: all the files previously in `Caffe2_GPU_SRCS` that are
* pure `.cu` files in `aten`match
* all the BLAS files
* all the THC files, except for THCAllocator.cpp, THCCachingHostAllocator.cpp and THCGeneral.cpp
* all files in `detail`
* LegacyDefinitions.cpp and LegacyTHFunctionsCUDA.cpp
* Register*CUDA.cpp
* CUDAHooks.cpp
* CUDASolver.cpp
* TensorShapeCUDA.cpp

`torch_cuda_cpp`: all other files in `Caffe2_GPU_SRCS`

Accordingly, TORCH_CUDA_API and TORCH_CUDA_BUILD_MAIN_LIB usages are getting split as well to TORCH_CUDA_CU_API and TORCH_CUDA_CPP_API.

To test this locally, you can run `export BUILD_SPLIT_CUDA=ON && python setup.py develop`. In your `build/lib` folder, you should find binaries for both `torch_cuda_cpp` and `torch_cuda_cu`. To see that the SPLIT_CUDA option was toggled, you can grep the Summary of running cmake and make sure `Split CUDA` is ON.

This build option is tested on CI for CUDA 11.1 builds (linux for now, but windows soon).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49050

Reviewed By: walterddr

Differential Revision: D26114310

Pulled By: janeyx99

fbshipit-source-id: 0180f2519abb5a9cdde16a6fb7dd3171cff687a6
2021-02-01 18:42:35 -08:00
Jithun Nair
327539ca79 Fix bug in hipify if include_dirs is not specified in setup.py (#50703)
Summary:
Bugs:
1) would introduce -I* in compile commands
2) wouldn't hipify source code directly in build_dir, only one level down or more

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50703

Reviewed By: mrshenli

Differential Revision: D25949070

Pulled By: ngimel

fbshipit-source-id: 018c2a056b68019a922e20e5db2eb8435ad147fe
2021-01-19 16:30:17 -08:00
Ralf Gommers
e29082b2a6 Run mypy over test/test_utils.py (#50278)
Summary:
_resubmission of gh-49654, which was reverted due to a cross-merge conflict_

This caught one incorrect annotation in `cpp_extension.load`.

xref gh-16574.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50278

Reviewed By: walterddr

Differential Revision: D25865278

Pulled By: ezyang

fbshipit-source-id: 25489191628af5cf9468136db36f5a0f72d9d54d
2021-01-11 08:16:23 -08:00
Rong Rong (AI Infra)
e3c56ddde6 Revert D25757691: [pytorch][PR] Run mypy over test/test_utils.py
Test Plan: revert-hammer

Differential Revision:
D25757691 (c86cfcd81d)

Original commit changeset: 145ce3ae532c

fbshipit-source-id: 3dfd68f0c42fc074cde15c6213a630b16e9d8879
2021-01-05 13:40:13 -08:00
Ralf Gommers
c86cfcd81d Run mypy over test/test_utils.py (#49654)
Summary:
This caught one incorrect annotation in `cpp_extension.load`.

xref gh-16574.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49654

Reviewed By: heitorschueroff

Differential Revision: D25757691

Pulled By: ezyang

fbshipit-source-id: 145ce3ae532cc585d9ca3bbd5381401bad0072e2
2021-01-05 09:32:06 -08:00
Samuel Marks
e6779d4357 [*.py] Rename "Arguments:" to "Args:" (#49736)
Summary:
I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings.

```sh
(pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do
    printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done
Args:      1095
Arguments: 0336
```

It is easy enough to extend my parsers to support both variants, however it looks like `Arguments:` is wrong anyway, as per:

  - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md)

  - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md)

  - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst)

Therefore, only `Args:` is valid. This PR replaces them throughout the codebase.

PS: For related PRs, see tensorflow/tensorflow/pull/45420

PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736

Reviewed By: albanD

Differential Revision: D25710534

Pulled By: soumith

fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619
2020-12-28 09:34:47 -08:00
Stas Bekman
60b4c40101 [extensions] fix is_ninja_available during cuda extension building (#49443)
Summary:
tl;dr: the current version of `is_ninja_available` in `torch/utils/cpp_extension.py` fails to run under recent versions of pip with the new build isolation feature, which is now the default. This PR fixes this problem.

The full story follows:

--------------------------

Currently, building https://github.com/facebookresearch/fairscale/, which builds CUDA extensions, fails with recent pip versions. The build fails in `is_ninja_available`, which runs `ninja --version` in a subprocess but with a /dev/null stream override that seems to break under the new pip versions. Currently I have `pip==20.3.3`. Recent pip performs build isolation: it first fetches all dependencies to somewhere under /tmp/pip-install-xyz and then builds the package.
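
A probe of this kind can be sketched as below. This is an illustration, not necessarily the exact patch: `is_tool_available` is a hypothetical generalisation, and it treats both a missing executable and a non-zero exit status (such as the broken `ninja` shim seen under build isolation) as "not available".

```python
import subprocess


def is_tool_available(command):
    """Return True if `command` (an argument list) runs and exits with status 0."""
    try:
        # check_output captures stdout itself, so no manual /dev/null
        # redirection of the stream is needed.
        subprocess.check_output(command)
    except (OSError, subprocess.CalledProcessError):
        return False
    return True


def is_ninja_available():
    """Probe for ninja by asking it for its version."""
    return is_tool_available(["ninja", "--version"])
```
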

If I build:

```
pip install fairscale --no-build-isolation
```
everything works.

When building normally (i.e. without `--no-build-isolation`), the failure is a long long trace,
<details>
<summary>Full log</summary>
<pre>
pip install fairscale
Collecting fairscale
  Downloading fairscale-0.1.1.tar.gz (83 kB)
     |████████████████████████████████| 83 kB 562 kB/s
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  ERROR: Command errored out with exit status 1:
   command: /home/stas/anaconda3/envs/main-38/bin/python /home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmpjvw00c7v
       cwd: /tmp/pip-install-1wq9f8fp/fairscale_347f218384a64f24b8d5ce846641213e
  Complete output (55 lines):
  running egg_info
  writing fairscale.egg-info/PKG-INFO
  writing dependency_links to fairscale.egg-info/dependency_links.txt
  writing requirements to fairscale.egg-info/requires.txt
  writing top-level names to fairscale.egg-info/top_level.txt
  Traceback (most recent call last):
    File "/home/stas/anaconda3/envs/main-38/bin/ninja", line 5, in <module>
      from ninja import ninja
  ModuleNotFoundError: No module named 'ninja'
  Traceback (most recent call last):
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 280, in <module>
      main()
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 263, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 114, in get_requires_for_build_wheel
      return hook(config_settings)
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 149, in get_requires_for_build_wheel
      return self._get_build_requires(
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 130, in _get_build_requires
      self.run_setup()
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 145, in run_setup
      exec(compile(code, __file__, 'exec'), locals())
    File "setup.py", line 56, in <module>
      setuptools.setup(
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
      return distutils.core.setup(**attrs)
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/dist.py", line 966, in run_commands
      self.run_command(cmd)
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 298, in run
      self.find_sources()
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 305, in find_sources
      mm.run()
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 536, in run
      self.add_defaults()
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 572, in add_defaults
      sdist.add_defaults(self)
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/command/sdist.py", line 228, in add_defaults
      self._add_defaults_ext()
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/command/sdist.py", line 311, in _add_defaults_ext
      build_ext = self.get_finalized_command('build_ext')
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/cmd.py", line 298, in get_finalized_command
      cmd_obj = self.distribution.get_command_obj(command, create)
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/distutils/dist.py", line 858, in get_command_obj
      cmd_obj = self.command_obj[command] = klass(self)
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 351, in __init__
      if not is_ninja_available():
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1310, in is_ninja_available
      subprocess.check_call('ninja --version'.split(), stdout=devnull)
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/subprocess.py", line 364, in check_call
      raise CalledProcessError(retcode, cmd)
  subprocess.CalledProcessError: Command '['ninja', '--version']' returned non-zero exit status 1.
  ----------------------------------------
ERROR: Command errored out with exit status 1: /home/stas/anaconda3/envs/main-38/bin/python /home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmpjvw00c7v Check the logs for full command output.
</pre>

</details>

and the middle of it is what we want:

```
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 351, in __init__
      if not is_ninja_available():
    File "/tmp/pip-build-env-a5x2icen/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1310, in is_ninja_available
      subprocess.check_call('ninja --version'.split(), stdout=devnull)
    File "/home/stas/anaconda3/envs/main-38/lib/python3.8/subprocess.py", line 364, in check_call
      raise CalledProcessError(retcode, cmd)
  subprocess.CalledProcessError: Command '['ninja', '--version']' returned non-zero exit status 1.
```

For some reason pytorch fails to run this simple code:

```
# torch/utils/cpp_extension.py
def is_ninja_available():
    r'''
    Returns ``True`` if the `ninja <https://ninja-build.org/>`_ build system is
    available on the system, ``False`` otherwise.
    '''
    with open(os.devnull, 'wb') as devnull:
        try:
            subprocess.check_call('ninja --version'.split(), stdout=devnull)
        except OSError:
            return False
        else:
            return True
```

I suspect that pip does something to `os.devnull` and that's why it fails.

This PR proposes a simpler code which doesn't rely on anything but `subprocess.check_output`:

```
def is_ninja_available():
    r'''
    Returns ``True`` if the `ninja <https://ninja-build.org/>`_ build system is
    available on the system, ``False`` otherwise.
    '''
    try:
        subprocess.check_output('ninja --version'.split())
    except Exception:
        return False
    else:
        return True
```

which doesn't use `os.devnull` and performs the same function. There could be a whole bunch of different exceptions there I think, so I went for the generic one - we don't care why it failed, since this function's only purpose is to suggest whether ninja can be used or not.
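
One detail from the traceback is worth spelling out: the exception that escapes is `subprocess.CalledProcessError`, which is not a subclass of `OSError`, so an `except OSError` clause does not catch a `ninja` launcher that starts but exits with a non-zero status. The sketch below illustrates this with a stand-in command. The helper `probe` and the failing command are hypothetical and are not part of PyTorch.

```python
import subprocess
import sys


def probe(cmd, caught):
    """Return True if cmd exits with 0, False if it raises one of `caught`."""
    try:
        subprocess.check_call(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except caught:
        return False
    return True


# Stand-in for a launcher script that starts fine but exits with status 1.
failing_cmd = [sys.executable, "-c", "raise SystemExit(1)"]

# With the narrow handler, CalledProcessError propagates to the caller.
narrow_handler_escaped = False
try:
    probe(failing_cmd, OSError)
except subprocess.CalledProcessError:
    narrow_handler_escaped = True

# With the broad handler, the same failure is reported as "not available".
broad_result = probe(failing_cmd, Exception)
```

Under this reading, switching to `except Exception` is what turns the failure into a plain `False`, independently of how the output stream is handled.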

Let's check

```
python -c "import torch.utils.cpp_extension; print(torch.utils.cpp_extension.is_ninja_available())"
True
```

Look ma - no std noise to take care of. (i.e. no need for /dev/null).

I was editing the installed environment-wide `cpp_extension.py` file directly, so I didn't need to tweak `PYTHONPATH`. I made sure to replace `'ninja --version'` with something that should fail, and I did get `False` for the above command line.

I next did a somewhat elaborate cheat to re-package an already existing binary wheel with this corrected version of `cpp_extension.py`, rather than building from source:
```
mkdir /tmp/pytorch-local-channel
cd /tmp/pytorch-local-channel

# get the latest nightly wheel
wget https://download.pytorch.org/whl/nightly/cu110/torch-1.8.0.dev20201215%2Bcu110-cp38-cp38-linux_x86_64.whl

# unpack it
unzip torch-1.8.0.dev20201215+cu110-cp38-cp38-linux_x86_64.whl

# edit torch/utils/cpp_extension.py to fix the python code with the new version as in this PR
emacs torch/utils/cpp_extension.py &

# pack the files back
zip -r torch-1.8.0.dev20201215+cu110-cp38-cp38-linux_x86_64.whl caffe2 torch torch-1.8.0.dev20201215+cu110.dist-info
```

Now I tell pip to use my local channel, plus `--pre` for it to pick up the pre-release as an acceptable wheel
```
# install using this local channel
git clone https://github.com/facebookresearch/fairscale/
cd fairscale
pip install -v --disable-pip-version-check -e . -f file:///tmp/pytorch-local-channel --pre
```
and voila all works.

```
[...]
Successfully installed fairscale
```

I noticed a whole bunch of "ninja not found" errors in the log. I think these are the same problem showing up in other build-system packages, which also use this old check (it has been copied across various projects and build tools) and which the recent pip breaks.

```
    writing manifest file '/tmp/pip-modern-metadata-_nsdesbq/fairscale.egg-info/SOURCES.txt'
    Traceback (most recent call last):
      File "/home/stas/anaconda3/envs/main-38/bin/ninja", line 5, in <module>
        from ninja import ninja
    ModuleNotFoundError: No module named 'ninja'
    [...]
    /tmp/pip-build-env-fqflyevr/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py:364: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
      warnings.warn(msg.format('we could not find ninja.'))
```

but these don't prevent the build from completing and installing.

I suppose these need to be identified and reported to various other projects, but that's another story.

The new pip does something to `os.devnull` I think which breaks any code relying on it - I haven't tried to figure out what happens to that stream object, but this PR which removes its usage solves the problem.

Also do notice that:

```
git clone https://github.com/facebookresearch/fairscale/
cd fairscale
python setup.py bdist_wheel
pip install dist/fairscale-0.1.1-cp38-cp38-linux_x86_64.whl
```
works too. So it is really a pip issue.

Apologies if the notes are too many, I tried to give the complete picture and probably other projects will need those details as well.

Thank you for reading.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49443

Reviewed By: mruberry

Differential Revision: D25592109

Pulled By: ezyang

fbshipit-source-id: bfce4420c28b614ead48e9686f4153c6e0fbe8b7
2020-12-16 18:02:11 -08:00
Gao, Xiang
d409da0677 Fix CUDA extension ninja build (#49344)
Summary:
I am submitting this PR on behalf of Janne Hellsten (nurpax) from NVIDIA, to simplify the CLA process. Many thanks to Janne for the contribution!

Currently, the ninja build decides whether to rebuild a .cu file more or less at random. There are actually two issues:

First, the arch list in the building command is ordered randomly. When the order changes, it will unconditionally rebuild regardless of the timestamp.

Second, the header files are not included in the dependency list, so if the header file changes, it is possible that ninja will not rebuild.

This PR fixes both issues. The fix for the second issue requires nvcc >= 10.2. With nvcc < 10.2, CUDA extensions still build as before, but ninja will not see changes in header files.
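
To illustrate the first issue: ninja rebuilds a target when its command line changes, so emitting the `-gencode` flags in a stable order avoids spurious rebuilds. The following is only a sketch of that idea, assuming the arch list is held as `"major.minor"` strings; the helper name `gencode_flags` is hypothetical and this is not the code from the PR.

```python
def gencode_flags(arch_list):
    """Build nvcc -gencode flags in a stable order from 'major.minor' strings."""
    # Sort on the numeric (major, minor) pair so that the result does not
    # depend on the iteration order of the input (e.g. a set).
    ordered = sorted(
        set(arch_list),
        key=lambda arch: tuple(int(part) for part in arch.split(".")),
    )
    flags = []
    for arch in ordered:
        num = arch.replace(".", "")
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
    return flags
```

Any input ordering of the same architectures then yields an identical command line.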

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49344

Reviewed By: glaringlee

Differential Revision: D25540157

Pulled By: ezyang

fbshipit-source-id: 197541690d7f25e3ac5ebe3188beb1f131a4c51f
2020-12-16 17:45:12 -08:00
Stas Bekman
02b63858f2 [CUDAExtension] support all visible cards when building a cudaextension (#48891)
Summary:
Currently CUDAExtension assumes that all cards are of the same type on the same machine and builds the extension with compute capability of the 0th card. This breaks later at runtime if the machine has cards of different types.

Specifically resulting in:
```
RuntimeError: CUDA error: no kernel image is available for execution on the device
```
when cards of a type that the extension wasn't compiled for are used (and, to the uninitiated, the error is far from telling what the problem is).

My current setup is:
```
$ CUDA_VISIBLE_DEVICES=0 python -c "import torch; print(torch.cuda.get_device_capability())"
(8, 6)
$ CUDA_VISIBLE_DEVICES=1 python -c "import torch; print(torch.cuda.get_device_capability())"
(6, 1)
```
but the extension was getting built with `-gencode=arch=compute_80,code=sm_80`.

This PR:
* [x] introduces a loop over all devices visible at build time to ensure the extension will run on all of them (it sorts the new list generated by the loop, so that the output is easier to debug should a card with a lower compute capability come last)
* [x] adds `+PTX` to the last entry of ccs derived from local cards (`if not _arch_list:`) to support other archs
* [x] adds a digest of my conversation with ptrblck on slack in the form of docs which hopefully can help others know which archs to support, how to override defaults, when and how to add PTX, etc.
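
The first two items can be sketched roughly as follows. This is a simplified illustration, not the PR's code: the function name `arch_list_from_capabilities` is hypothetical, and the input is assumed to be one `(major, minor)` tuple per visible device, such as the values `torch.cuda.get_device_capability(i)` would return for each `i` in `range(torch.cuda.device_count())`.

```python
def arch_list_from_capabilities(capabilities):
    """Turn per-device (major, minor) tuples into a sorted, de-duplicated arch list."""
    archs = sorted(
        {f"{major}.{minor}" for major, minor in capabilities},
        key=lambda arch: tuple(int(part) for part in arch.split(".")),
    )
    if archs:
        # Add PTX for the highest capability only, so that newer cards can
        # still JIT-compile the kernels.
        archs[-1] += "+PTX"
    return archs
```

For the two-card setup shown above, this would produce `['6.1', '8.6+PTX']`.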

Please kindly review that my prose is clear and easy to understand.

ptrblck

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48891

Reviewed By: ngimel

Differential Revision: D25358285

Pulled By: ezyang

fbshipit-source-id: 8160f3adebffbc8e592ddfcc3adf153a9dc91557
2020-12-08 14:57:10 -08:00
Jithun Nair
5f62308739 Hipify revamp [REDUX] (#48715)
Summary:
[Refiled version of earlier PR https://github.com/pytorch/pytorch/issues/45451]

This PR revamps the hipify module in PyTorch to overcome a long list of shortcomings in the original implementation. However, these improvements are applied only when using hipify to build PyTorch extensions, not for PyTorch or Caffe2 itself.

Correspondingly, changes are made to cpp_extension.py to match these improvements.

The list of improvements to hipify is as follows:

1. Hipify files in the same directory as the original file, unless there's a "cuda" subdirectory in the original file path, in which case the hipified file will be in the corresponding file path with "hip" subdirectory instead of "cuda".
2. Never hipify the file in-place if changes are introduced due to hipification i.e. always ensure the hipified file either resides in a different folder or has a different filename compared to the original file.
3. Prevent re-hipification of already hipified files. This avoids creation of unnecessary "hip/hip" etc. subdirectories and additional files which have no actual use.
4. Do not write out hipified versions of files if they are identical to the original file. This results in a cleaner output directory, with minimal number of hipified files created.
5. Update header rewrite logic so that it accounts for the previous improvement.
6. Update header rewrite logic so it respects the rules for finding header files depending on whether "" or <> is used.
7. Return a dictionary of mappings of original file paths to hipified file paths from hipify function.
8. Introduce a version for hipify module to allow extensions to contain back-compatible code that targets a specific point in PyTorch where the hipify functionality changed.
9. Update cuda_to_hip_mappings.py to account for the ROCm component subdirectories inside /opt/rocm/include. This also results in cleanup of the Caffe2_HIP_INCLUDE path to remove unnecessary additions to the include path.
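
As a rough illustration of rules 1 to 3, the output path for a hipified file could be chosen along these lines. This is a sketch under stated assumptions, not hipify's implementation: the function name `hipified_path` is hypothetical, and the `_hip` filename suffix used when there is no "cuda" directory is an assumption made purely for the example.

```python
from pathlib import PurePosixPath


def hipified_path(path, suffix="_hip"):
    """Pick an output path that never overwrites the original file."""
    p = PurePosixPath(path)
    parts = list(p.parts)
    directories = parts[:-1]
    if "hip" in directories:
        # Rule 3: already in a hipified location, so leave it alone.
        return str(p)
    if "cuda" in directories:
        # Rule 1: mirror the path with "hip" in place of "cuda".
        parts[directories.index("cuda")] = "hip"
        return str(PurePosixPath(*parts))
    # Rule 2: same directory, but a different filename (assumed suffix).
    return str(p.with_name(p.stem + suffix + p.suffix))
```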

The list of changes to cpp_extension.py is as follows:

1. Call hipify when building a CUDAExtension for ROCm.
2. Prune the list of source files to CUDAExtension to include only the hipified versions of any source files in the list (if both original and hipified versions of the source file are in the list)
3. Add subdirectories of /opt/rocm/include to the include path for extensions, so that ROCm headers for subcomponent libraries are found automatically
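
The pruning in item 2 might look something like the sketch below, assuming a dictionary that maps original paths to hipified paths (as in improvement 7 above). The name `prune_sources` and the exact data shapes are hypothetical.

```python
def prune_sources(sources, hipify_map):
    """Drop an original file when its hipified counterpart is also listed."""
    listed = set(sources)
    redundant = {
        original
        for original, hipified in hipify_map.items()
        if hipified != original and hipified in listed
    }
    # Preserve the original ordering of the remaining sources.
    return [source for source in sources if source not in redundant]
```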

cc jeffdaily sunway513 ezyang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48715

Reviewed By: bdhirsh

Differential Revision: D25272824

Pulled By: ezyang

fbshipit-source-id: 8bba68b27e41ca742781e1c4d7b07c6f985f040e
2020-12-02 18:03:23 -08:00
Eli Uriegas
780f2b9a9b torch: Stop using _nt_quote_args from distutils (#48618)
Summary:
They removed the specific function in Python 3.9 so we should just
remake the function here and use our own instead of relying on hidden
functions from the stdlib
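
A local replacement can be very small. The sketch below assumes the helper only needs to wrap arguments that contain spaces in double quotes; the name `quote_args_with_spaces` is hypothetical and the actual function added by this change may differ.

```python
def quote_args_with_spaces(args):
    """Wrap any argument containing a space in double quotes."""
    if not args:
        return []
    return [f'"{arg}"' if " " in arg else arg for arg in args]
```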

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Fixes https://github.com/pytorch/pytorch/issues/48617

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48618

Reviewed By: samestep

Differential Revision: D25230281

Pulled By: seemethere

fbshipit-source-id: 57216af40a4ae4dc8bafcf40d2eb3ba793b9b6e2
2020-12-02 16:53:25 -08:00
Taylor Robie
022c929145 Revert "Revert D25199264: Enable callgrind collection for C++ snippets" (#48720)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48720

This reverts commit 6646ff122d.

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D25273994

Pulled By: malfet

fbshipit-source-id: 61743176dc650136622e1b8f2384bbfbd7a46294
2020-12-02 11:10:11 -08:00
Taylor Robie
07f038aa9d Add option for cpp_extensions to compile standalone executable (#47862)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47862

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D25199265

Pulled By: robieta

fbshipit-source-id: eceb04dea60b82eb10434099639fa3afa61000ca
2020-12-01 20:03:08 -08:00
Nikita Shulga
8af9f2cc23 Revert D24924736: [pytorch][PR] Hipify revamp
Test Plan: revert-hammer

Differential Revision:
D24924736 (10b490a3e0)

Original commit changeset: 4af42b8ff4f2

fbshipit-source-id: 7f8f90d55d8a69a2890ec73622fcea559189e381
2020-11-18 11:48:30 -08:00
Jithun Nair
10b490a3e0 Hipify revamp (#45451)
Summary:
This PR revamps the hipify module in PyTorch to overcome a long list of shortcomings in the original implementation. However, these improvements are applied only when using hipify to build PyTorch extensions, **not for PyTorch or Caffe2 itself**.

Correspondingly, changes are made to `cpp_extension.py` to match these improvements.

The list of improvements to hipify is as follows:

1. Hipify files in the same directory as the original file, unless there's a "cuda" subdirectory in the original file path, in which case the hipified file will be in the corresponding file path with "hip" subdirectory instead of "cuda".
2. Never hipify the file in-place if changes are introduced due to hipification i.e. always ensure the hipified file either resides in a different folder or has a different filename compared to the original file.
3. Prevent re-hipification of already hipified files. This avoids creation of unnecessary "hip/hip" etc. subdirectories and additional files which have no actual use.
4. Do not write out hipified versions of files if they are identical to the original file. This results in a cleaner output directory, with minimal number of hipified files created.
5. Update header rewrite logic so that it accounts for the previous improvement.
6. Update header rewrite logic so it respects the rules for finding header files depending on whether `""` or `<>` is used.
7. Return a dictionary of mappings of original file paths to hipified file paths from `hipify` function.
8. Introduce a version for hipify module to allow extensions to contain back-compatible code that targets a specific point in PyTorch where the hipify functionality changed.
9. Update `cuda_to_hip_mappings.py` to account for the ROCm component subdirectories inside `/opt/rocm/include`. This also results in cleanup of the `Caffe2_HIP_INCLUDE` path to remove unnecessary additions to the include path.

The list of changes to `cpp_extension.py` is as follows:
1. Call `hipify` when building a CUDAExtension for ROCm.
2. Prune the list of source files to CUDAExtension to include only the hipified versions of any source files in the list (if both original and hipified versions of the source file are in the list)
3. Add subdirectories of /opt/rocm/include to the include path for extensions, so that ROCm headers for subcomponent libraries are found automatically

cc jeffdaily sunway513 hgaspar lcskrishna ashishfarmer

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45451

Reviewed By: ezyang

Differential Revision: D24924736

Pulled By: malfet

fbshipit-source-id: 4af42b8ff4f21c3782dedb8719b8f9f86b34bd2d
2020-11-18 08:37:49 -08:00
Chester Liu
17a6bc7c1b Cleanup unused code for Python < 3.6 (#47822)
Summary:
I think these can be safely removed since the min version of supported Python is now 3.6

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47822

Reviewed By: smessmer

Differential Revision: D24954936

Pulled By: ezyang

fbshipit-source-id: 5d4b2aeb78fc97d7ee4abaf5fb2aae21bf765e8b
2020-11-13 21:37:01 -08:00
peter
d73a8db2d2 Use local env for building CUDA extensions on Windows (#47150)
Summary:
Fixes https://github.com/pytorch/vision/pull/2818#issuecomment-719167504
After activating the VC env multiple times, the following error will be raised when building a CUDA extension.
```
FAILED: C:/tools/MINICO~1/CONDA-~2/TORCHV~1/work/build/temp.win-amd64-3.8/Release/tools/MINICO~1/CONDA-~2/TORCHV~1/work/torchvision/csrc/cuda/PSROIAlign_cuda.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin\nvcc -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -Dtorchvision_EXPORTS -IC:\tools\MINICO~1\CONDA-~2\TORCHV~1\work\torchvision\csrc -I%PREFIX%\lib\site-packages\torch\include -I%PREFIX%\lib\site-packages\torch\include\torch\csrc\api\include -I%PREFIX%\lib\site-packages\torch\include\TH -I%PREFIX%\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\include" -I%PREFIX%\include -I%PREFIX%\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files 
(x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -I%PREFIX%\Library\include -c C:\tools\MINICO~1\CONDA-~2\TORCHV~1\work\torchvision\csrc\cuda\PSROIAlign_cuda.cu -o C:\tools\MINICO~1\CONDA-~2\TORCHV~1\work\build\temp.win-amd64-3.8\Release\tools\MINICO~1\CONDA-~2\TORCHV~1\work\torchvision\csrc\cuda\PSROIAlign_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_35,code=sm_35 -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_50,code=compute_50 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
'cl.exe' is not recognized as an internal or external command,
operable program or batch file.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47150

Reviewed By: agolynski

Differential Revision: D24706019

Pulled By: ezyang

fbshipit-source-id: c13dc29f62d2d12d6a56f33dd450b467a1bf193b
2020-11-10 20:02:06 -08:00
Yuxin Wu
5cba3cec5a fix extensions build flags on newer GPUs (#47585)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47352

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47585

Reviewed By: heitorschueroff

Differential Revision: D24833654

Pulled By: ezyang

fbshipit-source-id: eaec5b8db5f35cac0a74d2858cb054a3853b0990
2020-11-10 11:38:18 -08:00
Simon Geisler
abae12ba41 only set ccbin flag if not provided by user (#47404)
Summary:
Avoid an nvcc error if the user specifies a C compiler (as pointed out in https://github.com/pytorch/pytorch/issues/47377)
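
The idea can be sketched as follows: only add a default `-ccbin` when the user's nvcc flags do not already select a host compiler. This is a simplified illustration, not the code from the PR; the function name `with_default_ccbin` is hypothetical, and it checks for the two spellings nvcc accepts for this option (`-ccbin` and `--compiler-bindir`).

```python
import os


def with_default_ccbin(nvcc_flags, cc=None):
    """Prepend -ccbin <cc> unless the user already chose a host compiler."""
    if cc is None:
        cc = os.environ.get("CC")
    user_set = any(
        flag.startswith(("-ccbin", "--compiler-bindir")) for flag in nvcc_flags
    )
    if cc is None or user_set:
        # Nothing to add, or the user's choice wins.
        return list(nvcc_flags)
    return ["-ccbin", cc] + list(nvcc_flags)
```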

Fixes https://github.com/pytorch/pytorch/issues/47377

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47404

Reviewed By: ejguan

Differential Revision: D24748833

Pulled By: malfet

fbshipit-source-id: 1a4ad1f851c8854795f7f98e28f479a0ff458a00
2020-11-10 07:55:57 -08:00
Nikita Shulga
2b6a720eb1 Update pybind to 2.6.0 (#46415)
Summary:
Preserve PYBIND11 configuration options in `torch._C._PYBIND11_COMPILER_TYPE` and use them when building extensions

Also, use f-strings in `torch.utils.cpp_extension`

"Fixes" https://github.com/pytorch/pytorch/issues/46367

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46415

Reviewed By: VitalyFedyunin

Differential Revision: D24605949

Pulled By: malfet

fbshipit-source-id: 87340f2ed5308266a46ef8f0317316227dab9d4d
2020-10-29 10:53:47 -07:00
Nikita Shulga
42a51148c1 Use f-strings in torch.utils.cpp_extension (#47025)
Summary:
Plus two minor fixes to `torch/csrc/Module.cpp`:
 - Use iterator of type `Py_ssize_t` for array indexing in `THPModule_initNames`
 - Fix clang-tidy warning of unneeded defaultGenerator copy by capturing it as `const auto&`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47025

Reviewed By: samestep

Differential Revision: D24605907

Pulled By: malfet

fbshipit-source-id: c276567d320758fa8b6f4bd64ff46d2ea5d40eff
2020-10-28 21:32:33 -07:00
Guilherme Leobas
789e935304 Annotate torch.nn.cpp (#46490)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46489

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46490

Reviewed By: zhangguanheng66

Differential Revision: D24509519

Pulled By: ezyang

fbshipit-source-id: edffd32ab2ac17ae4bbd44826b71f5cb9f1da1c5
2020-10-23 17:40:32 -07:00
Jithun Nair
65da50c099 Apply hip vs hipcc compilation flags correctly for building extensions (#46273)
Summary:
Fixes issues when building certain PyTorch extensions where the cpp files do NOT compile if flags such as `__HIP_NO_HALF_CONVERSIONS__` are defined.
cc jeffdaily

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46273

Reviewed By: zou3519

Differential Revision: D24422463

Pulled By: ezyang

fbshipit-source-id: 7a43d1f7d59c95589963532ef3bd3c68cb8262be
2020-10-21 11:40:40 -07:00
Alexander Grund
5b0f400488 Replace list(map(...)) constructs by list comprehensions (#46461)
Summary:
As discussed in https://github.com/pytorch/pytorch/issues/46392 this makes the code more readable and possibly more performant.

It also fixes a bug detected by this where the argument order of `map` was confused: 030a24906e (diff-5bb26bd3a23ee3bb540aeadcc0385df2a4e48de39f87ed9ea76b21990738fe98L1537-R1537)
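
A small generic example of the rewrite and of the kind of argument-order mistake it makes harder to write (the data here is made up for illustration):

```python
paths = ["a.cpp", "b.cu"]

# Equivalent results; the comprehension states the per-item operation inline.
via_map = list(map(str.upper, paths))
via_comprehension = [path.upper() for path in paths]

# map() expects the function first. With the arguments swapped, the second
# argument is not iterable, so a TypeError is raised.
swapped_fails = False
try:
    list(map(paths, str.upper))
except TypeError:
    swapped_fails = True
```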

Fixes https://github.com/pytorch/pytorch/issues/46392

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46461

Reviewed By: ailzhang

Differential Revision: D24367015

Pulled By: ezyang

fbshipit-source-id: d55a67933cc22346b00544c9671f09982ad920e7
2020-10-19 18:42:49 -07:00
Alexandre Saint
c734961e26 [cpp-extensions] Ensure default extra_compile_args (#45956)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/45835

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45956

Reviewed By: ngimel

Differential Revision: D24162289

Pulled By: albanD

fbshipit-source-id: 9ba2ad51e818864f6743270212ed94d86457f4e6
2020-10-09 07:33:28 -07:00
Xiang Gao
2fa062002e CUDA BFloat16 infrastructure (#44925)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44925

Reviewed By: agolynski

Differential Revision: D23783910

Pulled By: ngimel

fbshipit-source-id: dacac2ad87d58056bdc68bfe0b7ab1de5c2af0d8
2020-10-02 16:21:30 -07:00
Xiang Gao
0a15646e15 CUDA RTX30 series support (#45489)
Summary:
I also opened a PR on cmake upstream: https://gitlab.kitware.com/cmake/cmake/-/merge_requests/5292

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45489

Reviewed By: zhangguanheng66

Differential Revision: D23997844

Pulled By: ezyang

fbshipit-source-id: 4e7443dde9e70632ee429184f0d51cb9aa5a98b5
2020-09-29 18:19:23 -07:00
Xiang Gao
20ac736200 Remove py2 compatible future imports (#44735)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735

Reviewed By: mruberry

Differential Revision: D23731306

Pulled By: ezyang

fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f
2020-09-16 12:55:57 -07:00
Nikita Shulga
4134b7abfa Pass CC env variable as ccbin argument to nvcc (#43931)
Summary:
This is the common behavior when one builds PyTorch (or any other CUDA project) using CMake, so it should be held true for Torch CUDA extensions as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43931

Reviewed By: ezyang, seemethere

Differential Revision: D23441793

Pulled By: malfet

fbshipit-source-id: 1af392107a94840331014fda970ef640dc094ae4
2020-09-01 17:26:08 -07:00
Akihiro Nitta
f17d7a5556 Fix exception chaining in torch/ (#43836)
Summary:
## Motivation
Fixes https://github.com/pytorch/pytorch/issues/43770.

## Description of the change
This PR fixes exception chaining only in files under `torch/` where appropriate.
To fix exception chaining, I used either:
1. `raise new_exception from old_exception` where `new_exception` itself seems not descriptive enough to debug or `old_exception` delivers valuable information.
2. `raise new_exception from None` where raising both of `new_exception` and `old_exception` seems a bit noisy and redundant.
I subjectively chose which one to use from the above options.
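
For reference, the two forms behave as follows. The functions below are made-up examples and are not taken from the files touched by this PR.

```python
def parse_port(text):
    try:
        return int(text)
    except ValueError as err:
        # Option 1: keep the original error attached as __cause__.
        raise RuntimeError(f"invalid port: {text!r}") from err


def lookup(table, key):
    try:
        return table[key]
    except KeyError:
        # Option 2: suppress the original error in the traceback.
        raise AttributeError(f"no such option: {key}") from None
```

With `from err`, the traceback shows both exceptions joined by "The above exception was the direct cause of the following exception"; with `from None`, only the new exception is shown.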

## List of lines containing raise in except clause:
I wrote [this simple script](https://gist.github.com/akihironitta/4223c1b32404b36c1b349d70c4c93b4d) using [ast](https://docs.python.org/3.8/library/ast.html#module-ast) to list the lines that `raise` inside an `except` clause.

- [x] 000739c31a/torch/jit/annotations.py (L35)
- [x] 000739c31a/torch/jit/annotations.py (L150)
- [x] 000739c31a/torch/jit/annotations.py (L158)
- [x] 000739c31a/torch/jit/annotations.py (L231)
- [x] 000739c31a/torch/jit/_trace.py (L432)
- [x] 000739c31a/torch/nn/utils/prune.py (L192)
- [x] 000739c31a/torch/cuda/nvtx.py (L7)
- [x] 000739c31a/torch/utils/cpp_extension.py (L1537)
- [x] 000739c31a/torch/utils/tensorboard/_pytorch_graph.py (L292)
- [x] 000739c31a/torch/utils/data/dataloader.py (L835)
- [x] 000739c31a/torch/utils/data/dataloader.py (L849)
- [x] 000739c31a/torch/utils/data/dataloader.py (L856)
- [x] 000739c31a/torch/testing/_internal/common_utils.py (L186)
- [x] 000739c31a/torch/testing/_internal/common_utils.py (L189)
- [x] 000739c31a/torch/testing/_internal/common_utils.py (L424)
- [x] 000739c31a/torch/testing/_internal/common_utils.py (L1279)
- [x] 000739c31a/torch/testing/_internal/common_utils.py (L1283)
- [x] 000739c31a/torch/testing/_internal/common_utils.py (L1356)
- [x] 000739c31a/torch/testing/_internal/common_utils.py (L1388)
- [x] 000739c31a/torch/testing/_internal/common_utils.py (L1391)
- [ ] 000739c31a/torch/testing/_internal/common_utils.py (L1412)
- [x] 000739c31a/torch/testing/_internal/codegen/random_topo_test.py (L310)
- [x] 000739c31a/torch/testing/_internal/codegen/random_topo_test.py (L329)
- [x] 000739c31a/torch/testing/_internal/codegen/random_topo_test.py (L332)
- [x] 000739c31a/torch/testing/_internal/jit_utils.py (L183)
- [x] 000739c31a/torch/testing/_internal/common_nn.py (L4789)
- [x] 000739c31a/torch/onnx/utils.py (L367)
- [x] 000739c31a/torch/onnx/utils.py (L659)
- [x] 000739c31a/torch/onnx/utils.py (L892)
- [x] 000739c31a/torch/onnx/utils.py (L897)
- [x] 000739c31a/torch/serialization.py (L108)
- [x] 000739c31a/torch/serialization.py (L754)
- [x] 000739c31a/torch/distributed/rpc/_testing/faulty_agent_backend_registry.py (L76)
- [x] 000739c31a/torch/distributed/rpc/backend_registry.py (L260)
- [x] 000739c31a/torch/distributed/distributed_c10d.py (L184)
- [x] 000739c31a/torch/_utils_internal.py (L57)
- [x] 000739c31a/torch/hub.py (L494)
- [x] 000739c31a/torch/contrib/_tensorboard_vis.py (L16)
- [x] 000739c31a/torch/distributions/lowrank_multivariate_normal.py (L100)
- [x] 000739c31a/torch/distributions/constraint_registry.py (L142)
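
A list like the one above can be produced with a few lines of `ast`. The sketch below is not the linked gist; it is a minimal stand-in with a hypothetical function name `raises_in_except`, and it also reports `raise` statements inside functions that are defined within a handler.

```python
import ast


def raises_in_except(source):
    """Return sorted line numbers of raise statements inside except clauses."""
    tree = ast.parse(source)
    lines = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.ExceptHandler):
            # Search everything nested in the handler body.
            for stmt in node.body:
                for inner in ast.walk(stmt):
                    if isinstance(inner, ast.Raise):
                        lines.add(inner.lineno)
    return sorted(lines)


example = '''\
try:
    import foo
except ImportError:
    raise RuntimeError("foo is required")
raise ValueError("outside any handler")
'''
```

Running it over each file under `torch/` and printing `path (L<line>)` would give output in the same shape as the checklist.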

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43836

Reviewed By: ailzhang

Differential Revision: D23431212

Pulled By: malfet

fbshipit-source-id: 5f7f41b391164a5ad0efc06e55cd58c23408a921
2020-08-31 20:26:23 -07:00
Nikita Shulga
6753157c5a Enable torch.utils typechecks (#42960)
Summary:
Fix typos in torch/utils/_benchmark/README.md
Add empty __init__.py to examples folder to make example invocations from README.md correct
Fixed uniform distribution logic generation when minval and maxval are None

Fixes https://github.com/pytorch/pytorch/issues/42984

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42960

Reviewed By: seemethere

Differential Revision: D23095399

Pulled By: malfet

fbshipit-source-id: 0546ce7299b157d9a1f8634340024b10c4b7e7de
2020-08-13 15:24:56 -07:00
Ralf Gommers
bcab2d6848 Add type annotations for cpp_extension, utils.data, signal_handling (#42647)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42647

Reviewed By: ezyang

Differential Revision: D22967041

Pulled By: malfet

fbshipit-source-id: 35e124da0be56934faef56834a93b2b400decf66
2020-08-06 09:42:07 -07:00
Thomas Viehmann
0f78e596ba ROCm: Fix linking of custom ops in load_inline (#41257)
Summary:
Previously we did not link against amdhip64 (roughly equivalent to cudart). Apparently, the recent RTLD_GLOBAL fixes prevent the extensions from finding the symbols needed for launching kernels.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41257

Reviewed By: zou3519

Differential Revision: D22573288

Pulled By: ezyang

fbshipit-source-id: 89f9329b2097df26785e2f67e236d60984d40fdd
2020-07-17 12:14:50 -07:00
Edward Yang
22c7d183f7 If ninja is being used, force build_ext to run. (#40837)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40837

As ninja has accurate dependency tracking, if there is nothing to do,
then we will very quickly noop.  But this is important for correctness:
if a change was made to a header that is not listed explicitly in
the distutils Extension, then distutils will come to the wrong
conclusion about whether or not recompilation is needed (but Ninja
will work it out.)

This caused https://github.com/pytorch/vision/issues/2367
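
The staleness problem described above can be illustrated with a minimal sketch. This is not distutils' actual implementation: `needs_rebuild` and the file names are hypothetical, and the helper only mimics a timestamp check restricted to explicitly listed dependencies.

```python
import os
import tempfile


def needs_rebuild(target, listed_deps):
    """Timestamp check over explicitly listed dependencies only (hypothetical helper)."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in listed_deps)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "op.cpp")
        hdr = os.path.join(tmp, "op.h")  # included by op.cpp but not listed anywhere
        obj = os.path.join(tmp, "op.o")
        for path in (src, hdr, obj):
            with open(path, "w") as f:
                f.write("")
        # Pin timestamps explicitly so the result does not depend on clock resolution.
        os.utime(src, (100, 100))
        os.utime(obj, (200, 200))
        os.utime(hdr, (300, 300))  # header edited after the object was built
        listed_only = needs_rebuild(obj, [src])
        with_header = needs_rebuild(obj, [src, hdr])
        return listed_only, with_header


stale_by_listed_sources, stale_with_header = demo()
```

With only the listed source considered, the object looks up to date even though the header changed. Ninja reads header dependencies from compiler-emitted depfiles, so it would see the header and rebuild.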

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D22340930

Pulled By: ezyang

fbshipit-source-id: 481b74f6e2cc78159d2a74d413751cf7cf16f592
2020-07-07 09:49:31 -07:00
Pavel Belevich
95e51bb7f8 change BuildExtension.with_options to return a class not a c-tor (#40121)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40121
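
The difference between returning a constructor and returning a class can be sketched as follows. `BuildCommand` and both `with_options_*` methods are hypothetical stand-ins, not PyTorch's `BuildExtension`; the sketch only shows why code that expects a command class (for example, anything that calls `issubclass` on it) behaves differently in the two cases.

```python
import functools


class BuildCommand:
    """Stand-in for a setuptools command class (hypothetical)."""

    def __init__(self, *args, **kwargs):
        self.use_ninja = kwargs.get("use_ninja", True)

    @classmethod
    def with_options_ctor(cls, **options):
        # Returns a callable that constructs instances, but is not itself a class.
        return functools.partial(cls, **options)

    @classmethod
    def with_options_class(cls, **options):
        # Returns a real subclass with the options baked into __init__.
        class _WithOptions(cls):
            def __init__(self, *args, **kwargs):
                kwargs.update(options)
                super().__init__(*args, **kwargs)

        return _WithOptions


ctor = BuildCommand.with_options_ctor(use_ninja=False)
klass = BuildCommand.with_options_class(use_ninja=False)
```

Both `ctor()` and `klass()` produce an instance with `use_ninja` set to `False`, but only `klass` passes `isinstance(klass, type)` and `issubclass(klass, BuildCommand)`.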

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D22076634

Pulled By: pbelevich

fbshipit-source-id: a89740baf75208065e418d7f972eeb52db9ee3cf
2020-06-17 12:09:09 -07:00
lixinyu
7cb4eae8b1 correct some cpp extension code usages and documents (#39766)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39766

Test Plan: Imported from OSS

Differential Revision: D21967284

Pulled By: glaringlee

fbshipit-source-id: 8597916bee247cb5f8c82ed8297119d2f3a72170
2020-06-10 08:31:22 -07:00
Xiang Gao
b3fac8af6b Initial support for building on Ampere GPU, CUDA 11, cuDNN 8 (#39277)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39277

This PR contains initial changes that make PyTorch build with Ampere GPU, CUDA 11, and cuDNN 8.
TF32 related features will not be included in this PR.

Test Plan: Imported from OSS

Differential Revision: D21832814

Pulled By: malfet

fbshipit-source-id: 37f9c6827e0c26ae3e303580f666584230832d06
2020-06-02 10:03:42 -07:00
ashishfarmer
53b55d8f38 Use ninja build as default for HIPExtensions (#38939)
Summary:
This PR adds the following changes:
1. It sets the default extension build to use ninja
2. Adds HIPCC flags to the host code compile string for ninja builds. This is needed when host code makes HIP API calls

cc: ezyang jeffdaily
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38939

Differential Revision: D21721905

Pulled By: ezyang

fbshipit-source-id: 75206838315a79850ecf86a78391a31ba5ee97cb
2020-05-27 11:35:19 -07:00
Yuxin Wu
0e2a0478af Support paths with spaces when building ninja extension (#38670)
Summary:
Generates the following `build.ninja` file, which builds successfully:
```
cflags = -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA '-I/scratch/yuxinwu/space space/detectron2/layers/csrc' -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/TH -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/THC -I/public/apps/cuda/10.1/include -I/private/home/yuxinwu/miniconda3/include/python3.7m -c
post_cflags = -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
cuda_cflags = -DWITH_CUDA '-I/scratch/yuxinwu/space space/detectron2/layers/csrc' -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/TH -I/private/home/yuxinwu/miniconda3/lib/python3.7/site-packages/torch/include/THC -I/public/apps/cuda/10.1/include -I/private/home/yuxinwu/miniconda3/include/python3.7m -c
cuda_post_cflags = -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -ccbin=/public/apps/gcc/7.1.0/bin/gcc -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=sm_70 -std=c++14
ldflags =

rule compile
  command = $cxx -MMD -MF $out.d $cflags -c $in -o $out $post_cflags
  depfile = $out.d
  deps = gcc

rule cuda_compile
  command = $nvcc $cuda_cflags -c $in -o $out $cuda_post_cflags

build /scratch/yuxinwu/space$ space/build/temp.linux-x86_64-3.7/scratch/yuxinwu/space$ space/detectron2/layers/csrc/vision.o: compile /scratch/yuxinwu/space$ space/detectron2/layers/csrc/vision.cpp
build /scratch/yuxinwu/space$ space/build/temp.linux-x86_64-3.7/scratch/yuxinwu/space$ space/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_cpu.o: compile /scratch/yuxinwu/space$ space/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_cpu.cpp
build /scratch/yuxinwu/space$ space/build/temp.linux-x86_64-3.7/scratch/yuxinwu/space$ space/detectron2/layers/csrc/ROIAlignRotated/ROIAlignRotated_cpu.o: compile /scratch/yuxinwu/space$ space/detectron2/layers/csrc/ROIAlignRotated/ROIAlignRotated_cpu.cpp
build /scratch/yuxinwu/space$ space/build/temp.linux-x86_64-3.7/scratch/yuxinwu/space$ space/detectron2/layers/csrc/nms_rotated/nms_rotated_cpu.o: compile /scratch/yuxinwu/space$ space/detectron2/layers/csrc/nms_rotated/nms_rotated_cpu.cpp
build /scratch/yuxinwu/space$ space/build/temp.linux-x86_64-3.7/scratch/yuxinwu/space$ space/detectron2/layers/csrc/ROIAlign/ROIAlign_cpu.o: compile /scratch/yuxinwu/space$ space/detectron2/layers/csrc/ROIAlign/ROIAlign_cpu.cpp

```
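
The generated file above shows two distinct treatments of a path containing a space: compiler flags are shell-quoted, while paths on `build` lines use Ninja's `$ ` escape. A minimal sketch of both follows. The helper names are hypothetical, the example path is made up, and using `shlex.quote` for the flags is an assumption about one reasonable way to get this output, not a statement of what the PR does.

```python
import shlex


def escape_ninja_path(path):
    """Escape a path for use on a Ninja `build` line (hypothetical helper)."""
    # Ninja uses `$` as its escape character: `$$`, `$ ` and `$:` are literal
    # dollar, space and colon. Escape `$` first so later escapes are not doubled.
    return path.replace("$", "$$").replace(" ", "$ ").replace(":", "$:")


def quote_flag(flag):
    """Quote a compiler flag for the shell that runs the Ninja `command`."""
    return shlex.quote(flag)


include_dir = "/scratch/user/space space/csrc"  # illustrative path only
flag = quote_flag("-I" + include_dir)
build_path = escape_ninja_path(include_dir + "/vision.cpp")
```

Here `flag` comes out single-quoted, like the `-I` flag in `cflags` above, and `build_path` contains `space$ space`, like the `build` lines.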
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38670

Differential Revision: D21689613

Pulled By: ppwwyyxx

fbshipit-source-id: 1f71b12433e18f6b0c6aad5e1b390b4438654563
2020-05-21 14:57:40 -07:00
peter
a40049fd2a Better handling for msvc env when compiling cpp extensions (#38862)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/38861#issuecomment-631934636.
1. Error out if msvc env is activated but `DISTUTILS_USE_SDK` is not set.
2. Attempt to activate msvc env before running ninja build
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38862
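
The first check in the list above can be sketched roughly as follows. The function name is hypothetical, and detecting an activated MSVC environment through the `VSCMD_ARG_TGT_ARCH` variable is an assumption made for illustration. Only `DISTUTILS_USE_SDK` comes from the commit message.

```python
def check_msvc_env(environ, is_windows=True):
    """Return True if an MSVC environment looks active; raise if it is half-configured.

    Hypothetical sketch: `VSCMD_ARG_TGT_ARCH` is assumed to signal an activated
    Visual Studio developer environment.
    """
    vc_env_active = "VSCMD_ARG_TGT_ARCH" in environ
    if is_windows and vc_env_active and "DISTUTILS_USE_SDK" not in environ:
        raise RuntimeError(
            "The VC environment appears to be activated but DISTUTILS_USE_SDK "
            "is not set; set DISTUTILS_USE_SDK=1 and retry."
        )
    return vc_env_active
```

In real use, `environ` would be `os.environ`. Passing it as a parameter keeps the check easy to exercise on any platform.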

Differential Revision: D21686343

Pulled By: ezyang

fbshipit-source-id: 38b366654e2d0376dbdd21276689772b78e9718e
2020-05-21 12:52:22 -07:00
peter
4e46c95826 Fix cpp extension build failure if path contains space (#38860)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38860

Differential Revision: D21686335

Pulled By: ezyang

fbshipit-source-id: 2675f4f70b48ae3b58ea597a2b584b446d03c704
2020-05-21 12:36:27 -07:00
lixinyu
5a979fcb99 allow user passing relative paths in include_dirs within setuptools.setup (#38264)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38264

Test Plan: Imported from OSS

Differential Revision: D21509277

Pulled By: glaringlee

fbshipit-source-id: b0bc17d375a89b96b1bdacde5987b4f4baa9468e
2020-05-13 20:00:12 -07:00
ashish
5a386a0a78 Fix ldflags string for HIPExtensions (#38047)
Summary:
This pull request adds a check for the ROCm environment and skips adding CUDA-specific flags when a PyTorch extension is built on ROCm.

ezyang jeffdaily
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38047

Differential Revision: D21470507

Pulled By: ezyang

fbshipit-source-id: 5af2d7235e306c7aa9a5f7fc8760025417383069
2020-05-07 20:39:01 -07:00
ashishfarmer
402f635bbe Enable ahead of time compilation for HIPExtensions using ninja (#37800)
Summary:
This pull request enables ahead-of-time compilation of HIPExtensions with ninja by setting the appropriate compilation flags for the ROCm environment. It also enables the unit test for cuda_extensions on ROCm and removes the test for ahead-of-time compilation of extensions with ninja from ROCM_BLACKLIST.

ezyang jeffdaily
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37800

Differential Revision: D21408148

Pulled By: soumith

fbshipit-source-id: 146f4ffb3418f3534e6ce86805d3fe9c3eae84e1
2020-05-05 20:53:35 -07:00
peter
7c4bda7e6f Eliminate warnings for cpp extensions on Windows (#37400)
Summary:
Improve the readability of the logs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37400

Differential Revision: D21302597

Pulled By: ezyang

fbshipit-source-id: b8cbd33f95b6839ad4c6930bed8750c9b5a2ef7a
2020-04-30 20:28:03 -07:00
SsnL
13013848d5 Fix cpp_ext build dir create permission (#34239)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/34238
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34239

Differential Revision: D21328036

Pulled By: soumith

fbshipit-source-id: dac2735383b1a689139af5a23f61ccbebd1fd6c1
2020-04-30 11:30:07 -07:00
Lukas Koestler
0048243f70 Check compiler -v to determine compiler (fix #33701) (#37293)
Summary:
As described in the issue (https://github.com/pytorch/pytorch/issues/33701), the compiler check for building cpp extensions does not work with ccache. In this case we check `compiler -v` to determine which compiler is actually used and check it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37293
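
A minimal sketch of identifying the real compiler from `-v` output, under the assumption that the output contains a line such as `gcc version 7.5.0` or `clang version 10.0.0`. The function names and the regular expression are hypothetical and do not reproduce the PR's code.

```python
import re
import subprocess

# Matches e.g. "gcc version 7.5.0" or "Apple clang version 12.0.0".
_VERSION_RE = re.compile(r"\b(gcc|clang) version (\d+(?:\.\d+)*)")


def identify_compiler(version_output):
    """Return (name, version) parsed from `<compiler> -v` text, or None if unrecognized."""
    match = _VERSION_RE.search(version_output)
    if match is None:
        return None
    return match.group(1), match.group(2)


def query_compiler(compiler):
    """Run `<compiler> -v` and classify it, even when `compiler` is a wrapper."""
    # The `-v` banner typically goes to stderr, so merge it into stdout.
    result = subprocess.run(
        [compiler, "-v"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return identify_compiler(result.stdout)


sample = "Using built-in specs.\nTarget: x86_64-linux-gnu\ngcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)\n"
parsed = identify_compiler(sample)
```

Parsing the banner rather than the executable name is what makes this approach work when the configured compiler is a wrapper whose name says nothing about the underlying compiler.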

Differential Revision: D21256913

Pulled By: ezyang

fbshipit-source-id: 5483a10cc2dbcff98a7f069ea9dbc0c12b6502dc
2020-04-27 10:49:04 -07:00
David Reiss
e75fb4356b Remove (most) Python 2 support from Python code (#35615)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615

Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well (though using side-by-side view and ignoring
whitespace change might be helpful).

Test Plan: CI

Differential Revision: D20842886

Pulled By: dreiss

fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed
2020-04-22 09:23:14 -07:00
Thomas Viehmann
d070c0bcf0 ROCm: enable cpp_extensions.load/load_inline (#35897)
Summary:
This enables cpp_extensions.load/load_inline. This works by hipify-ing cuda sources.
Also enable tests.
CuDNN/MIOpen extensions aren't yet supported; I propose not doing this in this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35897

Differential Revision: D20983279

Pulled By: ezyang

fbshipit-source-id: a5d0f5ac592d04488a6a46522c58e2ee0a6fd57c
2020-04-13 11:44:08 -07:00
lizz
5d1205bf02 Suppress output when checking hipcc (#35789)
Summary:
Otherwise, a message is printed when hipcc is not found.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35789
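
One way to probe for a tool without leaking its output, or the shell's "not found" message, is sketched below. The helper name and the choice of `--version` as the probe command are assumptions for illustration, not the code from this PR.

```python
import subprocess


def tool_available(name):
    """Return True if `name --version` runs successfully, printing nothing either way."""
    try:
        subprocess.check_call(
            [name, "--version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (OSError, subprocess.CalledProcessError):
        # OSError covers a missing executable; CalledProcessError a non-zero exit.
        return False
    return True


missing = tool_available("definitely-not-a-real-tool-xyz")
```

A call such as `tool_available("hipcc")` would then report availability silently on machines without ROCm.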

Differential Revision: D20793089

Pulled By: ezyang

fbshipit-source-id: 4b3cb29fb1d74a1931603ee01e669013ccae9685
2020-04-01 13:03:21 -07:00
hainq
a0dc36e501 [Windows] Fix torch_cuda's forced link (#35659)
Summary:
The current config on `master` yields the following errors when building from source on Windows with CMake and Visual Studio 2019.
```
Severity	Code	Description	Project	File	Line	Suppression State
Error	LNK2001	unresolved external symbol \?warp_size@cuda@at@YAHXZ\	torch	D:\AI\pytorch\build_libtorch\caffe2\LINK	1
Severity	Code	Description	Project	File	Line	Suppression State
Error	LNK1120	1 unresolved externals	torch	D:\AI\pytorch\build_libtorch\bin\Release\torch.dll	1
Severity	Code	Description	Project	File	Line	Suppression State
Error	LNK2001	unresolved external symbol \?warp_size@cuda@at@YAHXZ\	caffe2_observers	D:\AI\pytorch\build_libtorch\modules\observers\LINK	1
Severity	Code	Description	Project	File	Line	Suppression State
Error	LNK1120	1 unresolved externals	caffe2_observers	D:\AI\pytorch\build_libtorch\bin\Release\caffe2_observers.dll	1
Severity	Code	Description	Project	File	Line	Suppression State
Error	LNK2001	unresolved external symbol \?warp_size@cuda@at@YAHXZ\	caffe2_detectron_ops_gpu	D:\AI\pytorch\build_libtorch\modules\detectron\LINK	1
Severity	Code	Description	Project	File	Line	Suppression State
Error	LNK1120	1 unresolved externals	caffe2_detectron_ops_gpu	D:\AI\pytorch\build_libtorch\bin\Release\caffe2_detectron_ops_gpu.dll	1
```

This change at least fixes the above errors in that specific setting. Do you think it makes sense to get this merged or will it break other settings?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35659

Differential Revision: D20735907

Pulled By: ezyang

fbshipit-source-id: eb8fa1e69aaaa5af2da3a76963ddc910bb716479
2020-03-30 13:59:31 -07:00
Nikita Shulga
0f0a5b11b8 Disable C4251 when compiling cpp_extensions on Windows (#35272)
Summary:
Otherwise, VC++ emits a warning for every exposed C++ symbol, for example:
```
include\c10/core/impl/LocalDispatchKeySet.h(53): warning C4251: 'c10::impl::LocalDispatchKeySet::included_': class 'c10::DispatchKeySet' needs to have dll-interface to be used by clients of struct 'c10::impl::LocalDispatchKeySet'
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35272
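
MSVC disables a specific warning with the `/wdNNNN` option, so suppressing C4251 amounts to adding `/wd4251` to the host compiler flags. A minimal sketch follows. The helper name and the base flags are hypothetical, and where PyTorch actually injects the flag is not shown here.

```python
def msvc_cflags(base_flags, disabled_warnings=(4251,)):
    """Append MSVC `/wdNNNN` options for each warning code to disable (hypothetical helper)."""
    return list(base_flags) + ["/wd{}".format(code) for code in disabled_warnings]


flags = msvc_cflags(["/MD", "/EHsc"])
```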

Test Plan: CI

Differential Revision: D20623005

Pulled By: malfet

fbshipit-source-id: b635b674159bb9654e4e1a1af4394c4f36fe35bd
2020-03-24 11:08:28 -07:00
peterjc123
9e6cd98c3f Ensure torch_cuda is linked against on Windows (#34288)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/31611.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34288

Differential Revision: D20314251

Pulled By: seemethere

fbshipit-source-id: 15ab2d4de665d553a1622a2d366148697deb6c02
2020-03-12 12:16:44 -07:00
Yuxin Wu
20b18a58f1 Update compiler warning about ABI compatibility (#34472)
Summary:
3ac4267763 already forces pytorch to use gcc>=5 everywhere
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34472

Differential Revision: D20345134

Pulled By: ezyang

fbshipit-source-id: 3ce706405e8784cac5c314500466b5f988ad31bf
2020-03-10 08:12:07 -07:00