Commit Graph

367 Commits

Author SHA1 Message Date
Dmitry Rogozhkin
d27ecf85db xpu: support sycl with torch.utils.cpp_extension APIs (#132945)
This patch adds support for building SYCL kernels via `torch.utils.cpp_extension.load`, `torch.utils.cpp_extension.load_inline` and the new `class SyclExtension` APIs. Files with the `.sycl` extension are treated as containing SYCL kernels and are compiled with `icpx` (the DPC++ SYCL compiler from Intel). Files with other extensions, such as `.cpp` and `.cu`, are handled as before. The API supports building SYCL sources together with other file types into a single extension.

Note that the `.sycl` file extension is a PyTorch convention for files containing SYCL code, which I propose to adopt. We followed up with the compiler team about introducing such a file extension in the compiler, but they are opposed to it. At the same time, discussion of a SYCL file extension and of adding SYCL language support to tools such as cmake is ongoing, and cmake may eventually introduce its own file extension convention for SYCL. I hope we can further influence the cmake and compiler communities to adopt the `.sycl` file extension more broadly.

By default, SYCL kernels are compiled for all Intel GPU devices for which PyTorch's native ATen SYCL kernels are compiled; at the moment this is `pvc,xe-lpg`. This behavior can be overridden by setting the `TORCH_XPU_ARCH_LIST` environment variable to a comma-separated list of devices to compile for.
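
As a rough illustration of the two conventions described above, the sketch below dispatches on the `.sycl` file extension and reads `TORCH_XPU_ARCH_LIST` with a fallback to the quoted defaults. The helper names `xpu_arch_list` and `classify_source` are hypothetical and are not taken from the patch; only the environment variable name, the file extensions, and the default device list come from the message.

```python
import os
from pathlib import Path

# Default targets quoted in the commit message, treated here as plain strings.
DEFAULT_XPU_ARCHS = ["pvc", "xe-lpg"]


def xpu_arch_list(env=None):
    """Return the devices named in TORCH_XPU_ARCH_LIST, or the defaults."""
    env = os.environ if env is None else env
    raw = env.get("TORCH_XPU_ARCH_LIST", "")
    archs = [item.strip() for item in raw.split(",") if item.strip()]
    return archs or list(DEFAULT_XPU_ARCHS)


def classify_source(path):
    """Hypothetical helper: decide which kind of compiler a source file needs."""
    suffix = Path(path).suffix
    if suffix == ".sycl":
        return "sycl"  # compiled with icpx, per the commit message
    if suffix == ".cu":
        return "cuda"
    return "cxx"
```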

Fixes: #132944

CC: @gujinghui @EikanWang @fengyuan14 @guangyey @jgong5

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132945
Approved by: https://github.com/albanD, https://github.com/guangyey, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2025-02-16 16:50:59 +00:00
PyTorch MergeBot
dd5d0ea6bb Revert "xpu: support sycl with torch.utils.cpp_extension APIs (#132945)"
This reverts commit 607379960b.

Reverted https://github.com/pytorch/pytorch/pull/132945 on behalf of https://github.com/malfet due to It just broke all the tests, see b16ae97ad0/1 ([comment](https://github.com/pytorch/pytorch/pull/132945#issuecomment-2661498747))
2025-02-16 16:03:42 +00:00
Dmitry Rogozhkin
607379960b xpu: support sycl with torch.utils.cpp_extension APIs (#132945)
This patch adds support for building SYCL kernels via `torch.utils.cpp_extension.load`, `torch.utils.cpp_extension.load_inline` and the new `class SyclExtension` APIs. Files with the `.sycl` extension are treated as containing SYCL kernels and are compiled with `icpx` (the DPC++ SYCL compiler from Intel). Files with other extensions, such as `.cpp` and `.cu`, are handled as before. The API supports building SYCL sources together with other file types into a single extension.

Note that the `.sycl` file extension is a PyTorch convention for files containing SYCL code, which I propose to adopt. We followed up with the compiler team about introducing such a file extension in the compiler, but they are opposed to it. At the same time, discussion of a SYCL file extension and of adding SYCL language support to tools such as cmake is ongoing, and cmake may eventually introduce its own file extension convention for SYCL. I hope we can further influence the cmake and compiler communities to adopt the `.sycl` file extension more broadly.

By default, SYCL kernels are compiled for all Intel GPU devices for which PyTorch's native ATen SYCL kernels are compiled; at the moment this is `pvc,xe-lpg`. This behavior can be overridden by setting the `TORCH_XPU_ARCH_LIST` environment variable to a comma-separated list of devices to compile for.

Fixes: #132944

CC: @gujinghui @EikanWang @fengyuan14 @guangyey @jgong5

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132945
Approved by: https://github.com/albanD, https://github.com/guangyey
2025-02-16 10:16:09 +00:00
Jane Xu
515e55e692 Set -DPy_LIMITED_API flag for py_limited_api=True extensions (#145764)
This could be BC breaking, because there was a period of time when we use py_limited_api=True but don't enforce the flag, and now that we will start enforcing the flag, people's custom extensions may fail to build.

This is strictly still better behavior, as it is sketchy to claim CPython agnosticism without the flag, but calling this out as potential people yelling at us. Ways to mitigate this risk + reasons this may not be too big a deal:
- People haven't known about py_limited_api for extensions much due to lack of docs from python so usage is low right now
- My current tutorial is in store to make new users of py_limited_api pass this flag, so it'd be a noop for them.
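
A minimal sketch of what enforcing the flag could look like: when `py_limited_api` is set, a `-DPy_LIMITED_API=<hex version>` define is appended to the compile arguments. The helper names and the minimum version of 3.9 are assumptions for illustration, not values confirmed by this message.

```python
def limited_api_define(min_version=(3, 9)):
    """Build the define for a minimum CPython version (assumed to be 3.9 here)."""
    major, minor = min_version
    # Py_LIMITED_API uses the PY_VERSION_HEX layout: 0xMMmm0000.
    return f"-DPy_LIMITED_API=0x{major:02x}{minor:02x}0000"


def compile_args(extra_args, py_limited_api):
    """Hypothetical helper: add the define only when py_limited_api is requested."""
    args = list(extra_args)
    if py_limited_api:
        args.append(limited_api_define())
    return args
```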

Test plan:
* Locally i'm confident as I tried rebuilding ao with this change and it reliably failed (cuz importing torch/extension.h is a nono)
* Unit test wise, the normal python_agnostic one I added should work

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145764
Approved by: https://github.com/ezyang, https://github.com/zou3519, https://github.com/albanD
2025-01-28 20:11:05 +00:00
H. Vetinari
e6c1e6e20e simplify torch.utils.cpp_extension.include_paths; use it in cpp_builder (#145480)
While working on conda-forge integration, I needed to look at the way the include paths are calculated, and noticed an avoidable duplication between `torch/utils/cpp_extension.py` and `torch/_inductor/cpp_builder.py`. The latter already imports the former anyway, so simply reuse the same function.

Furthermore, remove long-obsolete include-paths. AFAICT, the `/TH` headers have not existed since pytorch 1.11.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145480
Approved by: https://github.com/ezyang
2025-01-27 07:19:42 +00:00
Johnny
732c4998f3 [NVIDIA] Full Family Blackwell Support codegen (#145436)
More references:
https://github.com/NVIDIA/nccl

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145436
Approved by: https://github.com/ezyang, https://github.com/drisspg
2025-01-24 04:36:00 +00:00
Irem Yuksel
66bf7da446 Enable sleef for Win Arm64 (#144876)
The sleef module was disabled for Windows Arm64 in b021486405.
This PR enables it again since the issue is no longer valid.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144876
Approved by: https://github.com/albanD, https://github.com/malfet

Co-authored-by: Ozan Aydin <148207261+ozanMSFT@users.noreply.github.com>
2025-01-23 19:22:58 +00:00
Johnny
a57133e3c7 [NVIDIA] Jetson Thor Blackwell Support codegen (#145395)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145395
Approved by: https://github.com/eqy, https://github.com/malfet
2025-01-22 20:13:19 +00:00
johnnynunez
35f5668f7e [NVIDIA] RTX50 Blackwell Support codegen (#145270)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145270
Approved by: https://github.com/ezyang
2025-01-21 21:10:05 +00:00
Aaron Orenstein
2f9d378f7b PEP585 update - torch/utils (#145201)
See #145101 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145201
Approved by: https://github.com/bobrenjc93
2025-01-21 21:04:10 +00:00
Eddie Yan
28b1960d49 [CUDA] parse arch-conditional compute-capability when building extensions (#144446)
don't choke on arch-conditional compute capabilities e.g., `sm_90a`: #144037
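
For illustration, one way to accept an optional architecture suffix when parsing names such as `sm_90a` is sketched below. The `parse_sm` helper and its regular expression are hypothetical and are not the code from this PR.

```python
import re

# Major version (one or more digits), a single minor digit, and an optional
# lowercase suffix such as the "a" in sm_90a.
_SM_PATTERN = re.compile(r"^sm_(\d+)(\d)([a-z]?)$")


def parse_sm(name):
    """Split an sm_XY[a] name into (major, minor, suffix)."""
    match = _SM_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized compute capability: {name!r}")
    major, minor, suffix = match.groups()
    return int(major), int(minor), suffix
```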

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144446
Approved by: https://github.com/Skylion007, https://github.com/ezyang
2025-01-09 22:05:18 +00:00
PyTorch MergeBot
99f2491af9 Revert "Use absolute path path.resolve() -> path.absolute() (#129409)"
This reverts commit 45411d1fc9.

Reverted https://github.com/pytorch/pytorch/pull/129409 on behalf of https://github.com/jeanschmidt due to Breaking internal CI, @albanD please help get this PR merged ([comment](https://github.com/pytorch/pytorch/pull/129409#issuecomment-2571316444))
2025-01-04 14:17:20 +00:00
Xuehai Pan
45411d1fc9 Use absolute path path.resolve() -> path.absolute() (#129409)
Changes:

1. Always explicit `.absolute()`: `Path(__file__)` -> `Path(__file__).absolute()`
2. Replace `path.resolve()` with `path.absolute()` if the code is resolving the PyTorch repo root directory.
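
The difference between the two methods can be seen in a small sketch: `absolute()` only prepends the current working directory, while `resolve()` also collapses `..` components and follows symlinks. The directory names below are placeholders and need not exist.

```python
from pathlib import Path

relative = Path("some_dir") / ".." / "other_dir"

# absolute() anchors the path at the current directory and leaves ".." alone.
absolute = relative.absolute()

# resolve() normalizes ".." and would also follow any symlinks on the way.
resolved = relative.resolve()

print(absolute)
print(resolved)
```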

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129409
Approved by: https://github.com/albanD
2025-01-03 20:03:40 +00:00
PyTorch MergeBot
cc4e70b7c3 Revert "Use absolute path path.resolve() -> path.absolute() (#129409)"
This reverts commit 135c7db99d.

Reverted https://github.com/pytorch/pytorch/pull/129409 on behalf of https://github.com/malfet due to need to revert to as dependency of https://github.com/pytorch/pytorch/pull/129374 ([comment](https://github.com/pytorch/pytorch/pull/129409#issuecomment-2562969825))
2024-12-26 17:26:06 +00:00
Xuehai Pan
135c7db99d Use absolute path path.resolve() -> path.absolute() (#129409)
Changes:

1. Always explicit `.absolute()`: `Path(__file__)` -> `Path(__file__).absolute()`
2. Replace `path.resolve()` with `path.absolute()` if the code is resolving the PyTorch repo root directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129409
Approved by: https://github.com/albanD
2024-12-24 08:33:08 +00:00
xinan.lin
b5e159270a [AOTI XPU] Replace intel compiler with g++ to build inductor CPP wrapper in runtime. (#142322)
This PR aims to remove the dependency on the Intel compiler at Inductor runtime. Now we only need SYCL_HOME at runtime to find the SYCL headers and libs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142322
Approved by: https://github.com/EikanWang, https://github.com/desertfire, https://github.com/albanD
ghstack dependencies: #143491
2024-12-21 02:27:04 +00:00
atalman
dd2cd4279e Create build_directory if it does not exist when generating ninja build file (#143328)
Fixes: https://github.com/pytorch/vision/issues/8816
I am observing this failure on Windows, Python 3.13 vision builds:
```
Emitting ninja build file C:\actions-runner\_work\vision\vision\pytorch\vision\build\temp.win-amd64-cpython-313\Release\build.ninja...
error: [Errno 2] No such file or directory: 'C:\\actions-runner\\_work\\vision\\vision\\pytorch\\vision\\build\\temp.win-amd64-cpython-313\\Release\\build.ninja'
ERROR conda.cli.main_run:execute(49): `conda run packaging/windows/internal/vc_env_helper.bat python setup.py bdist_wheel` failed. (See above for error)
```

Adding the code above fixes it, confirmed by running `` python setup.py bdist_wheel`` :
```
building 'torchvision._C' extension
Emitting ninja build file C:\actions-runner\_work\vision\vision\pytorch\vision\build\temp.win-amd64-cpython-313\Release\build.ninja...
Creating build directory C:\actions-runner\_work\vision\vision\pytorch\vision\build\temp.win-amd64-cpython-313\Release
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/26] cl /showIncludes /nologo /O2 /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -Dtorchvision_EXPORTS -IC:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc -IC:\actions-runner\_work\_temp\conda_environment_12361066769\Lib\site-packages\torch\include -IC:\actions-runner\_work\_temp\conda_environment_12361066769\Lib\site-packages\torch\include\torch\csrc\api\include -IC:\actions-runner\_work\_temp\conda_environment_12361066769\Lib\site-packages\torch\include\TH -IC:\actions-runner\_work\_temp\conda_environment_12361066769\Lib\site-packages\torch\include\THC -IC:\actions-runner\_work\_temp\conda_environment_12361066769\include -IC:\actions-runner\_work\_temp\conda_environment_12361066769\Include "-IC:\Pr
```
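
The message does not show the code that was added. A minimal sketch of the idea, assuming the fix is simply to create the directory before writing the file, might look like this (the `write_ninja_file` helper is hypothetical):

```python
import os


def write_ninja_file(build_directory, content):
    """Create build_directory if needed, then write build.ninja inside it."""
    # exist_ok=True makes this a no-op when the directory is already there.
    os.makedirs(build_directory, exist_ok=True)
    path = os.path.join(build_directory, "build.ninja")
    with open(path, "w") as f:
        f.write(content)
    return path
```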

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143328
Approved by: https://github.com/kit1980, https://github.com/albanD
2024-12-17 00:20:43 +00:00
Jane Xu
be27dbf2b8 Enable CPP/CUDAExtension with py_limited_api for python agnosticism (#138088)
Getting tested with ao, but now there is a real test i added.

## What does this PR do?

We want to allow custom PyTorch extensions to be able to build one wheel for multiple Python versions, in other words, achieve python agnosticism. It turns out that there is such a way that setuptools/Python provides already! Namely, if the user promises to use only the Python limited API in their extension, they can pass in `py_limited_api` to their Extension class and to the bdist_wheel command (with a min python version) in order to build 1 wheel that will suffice across multiple Python versions.

Sounds lovely! Why don't people do that already with PyTorch? Well 2 things. This workflow is hardly documented (even searching for python agnostic specifically does not reveal many answers) so I'd expect that people simply don't know about it. But even if they did, _PyTorch_ custom Extensions would still not work because we always link torch_python, which does not abide by py_limited_api rules.

So this is where this PR comes in! We respect when the user specifies py_limited_api and skip linking torch_python under that condition, allowing users to enroll in the provided functionality I just described.
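
A sketch of the linking decision described above, assuming the set of libraries is `c10`, `torch`, `torch_cpu` and `torch_python`. The helper name is hypothetical and the list is an illustration rather than a copy of the implementation.

```python
def libraries_to_link(py_limited_api=False):
    """Return the libraries to link, skipping torch_python for limited-API builds."""
    libraries = ["c10", "torch", "torch_cpu"]
    if not py_limited_api:
        # torch_python does not abide by the Python limited API.
        libraries.append("torch_python")
    return libraries
```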

## How do I know this PR works?

I manually tested my silly little ultra_norm locally (with `import python_agnostic`) and wrote a test case for the extension showing that
- torch_python doesn't show up in the ldd tree
- no Py- symbols show up
It may be a little confusing that our test case is actually python-free (more clean than python-agnostic) but it is sufficient (and not necessary) towards showing that this change works.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138088
Approved by: https://github.com/ezyang, https://github.com/albanD
2024-12-11 18:22:55 +00:00
Jane Xu
47a571e166 Document that load_inline requires having a compiler installed (#137521)
Prompted by this forum q: https://discuss.pytorch.org/t/are-the-requirements-for-using-torch-utils-cpp-extension-with-cuda-documented-anywhere/211222

Would be curious to know if we could get more precise.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137521
Approved by: https://github.com/zou3519
2024-12-11 03:47:54 +00:00
Fabian Keller
5e8e1d725a Remove some unused type ignores (round 1) (#142325)
Over time, a large number of the existing type ignores have become irrelevant/unused/dead as a result of improvements in annotations and type checking.

Having these `# type: ignore` linger around is not ideal for two reasons:

- They are verbose/ugly syntactically.
- They could hide genuine bugs in the future, if a refactoring would actually introduce a bug but it gets hidden by the ignore.

I'm counting over 1500 unused ignores already. This is a first PR that removes some of them. Note that I haven't touched type ignores that looked "conditional" like the import challenge mentioned in https://github.com/pytorch/pytorch/pull/60006#issuecomment-2480604728. I will address these at a later point, and eventually would enable `warn_unused_ignores = True` in the mypy configuration as discussed in that comment to prevent accumulating more dead ignores going forward.

This PR should have no effect on runtime at all.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142325
Approved by: https://github.com/Skylion007, https://github.com/janeyx99
2024-12-09 18:23:46 +00:00
drisspg
3fdc74ae29 Fix dumb typo (#142079)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142079
Approved by: https://github.com/jainapurva, https://github.com/soulitzer
2024-12-05 00:43:49 +00:00
drisspg
0582b32f6c Enable Extension Support (#142028)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142028
Approved by: https://github.com/ezyang, https://github.com/eqy
2024-12-04 15:54:06 +00:00
Nikita Shulga
2398e758d2 Fix access to _msvccompiler from newer distutils (#141363)
Newer versions of distutils no longer import `_msvccompiler` upon init (on the Windows platform; that was not the case on other platforms even before 74), but it's still accessible if one chooses to import it directly.
Test plan:
```
% python -c 'from setuptools import distutils; print(distutils.__version__, hasattr(distutils, "_msvccompiler")); from distutils import _msvccompiler; import setuptools; print(setuptools.__version__, _msvccompiler.__file__)'
3.10.9 False
65.5.0 /usr/local/fbcode/platform010/Python3.10.framework/Versions/3.10/lib/python3.10/site-packages/setuptools/_distutils/_msvccompiler.py
```
and
```
% python -c 'from setuptools import distutils; print(distutils.__version__, hasattr(distutils, "_msvccompiler")); from distutils import _msvccompiler; import setuptools; print(setuptools.__version__, _msvccompiler.__file__)'
3.13.0 False
75.6.0 /Users/malfet/py312-venv/lib/python3.13/site-packages/setuptools/_distutils/_msvccompiler.py
```

Gave up trying to appease the linter, so rewrote it as the following function:
```python
def _get_vc_env(vc_arch: str) -> dict[str, str]:
    try:
        from setuptools import distutils  # type: ignore[import]

        return distutils._msvccompiler._get_vc_env(vc_arch)  # type: ignore[no-any-return]
    except AttributeError:
        from setuptools._distutils import _msvccompiler  # type: ignore[import]

        return _msvccompiler._get_vc_env(vc_arch)  # type: ignore[no-any-return]
```

This PR also undoes the setuptools version restriction introduced by https://github.com/pytorch/pytorch/pull/136489, as the premise for the restriction is incorrect.

Fixes https://github.com/pytorch/pytorch/issues/141319

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141363
Approved by: https://github.com/huydhn, https://github.com/atalman
2024-11-25 01:50:47 +00:00
Irem Yuksel
b021486405 Enable Windows Arm64 (#133088)
This PR enables PyTorch for Windows on Arm64 - CPU only.
Currently, there aren't any checks in place to build and test for Windows on Arm64, but we're working to implement those as soon as possible.
We recommend using [Arm Performance Libraries (APL)](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Libraries) as a BLAS option, which is introduced in this PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133088
Approved by: https://github.com/malfet

Co-authored-by: cristian panaite <panaite.cristian2000@gmail.com>
Co-authored-by: Stefan-Alin Pahontu <56953855+alinpahontu2912@users.noreply.github.com>
Co-authored-by: Ozan Aydin <148207261+ozanMSFT@users.noreply.github.com>
2024-10-24 16:10:44 +00:00
Tom Ritchford
c0582fd0f8 Remove unused Python variables in torch/[b-z]* (#136963)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136963
Approved by: https://github.com/ezyang
2024-10-19 16:45:22 +00:00
albanD
e4571e7025 Add abi flags to cpp_extension cache folder (#136890)
This is to avoid cache confusion between normal vs pydebug vs nogil builds in cpp extensions which can lead to catastrophic ABI issues.
This is rare today for people to run both normal and pydebug on the same machine, but we expect quite a few people will run normal and nogil on the same machine going forward.

This is tested locally by running each version alternatively.
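
One way to keep the caches apart is to include `sys.abiflags` in the cache folder name, sketched below. The folder name format and the `build_folder_name` helper are assumptions for illustration; `sys.abiflags` may be missing on some platforms, hence the `getattr` fallback.

```python
import sys


def build_folder_name(accelerator="cpu"):
    """Hypothetical cache folder name that varies with the interpreter ABI."""
    # abiflags is "" for a regular build, "d" for pydebug, "t" for free-threaded.
    abiflags = getattr(sys, "abiflags", "")
    version = f"py{sys.version_info.major}{sys.version_info.minor}"
    return f"{version}{abiflags}_{accelerator}"
```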
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136890
Approved by: https://github.com/colesbury
2024-09-28 00:49:56 +00:00
Ramana Sundararaman
be4b7e8131 Param fixes in docstring (#136097)
Fixes wrong param names in docstrings. cc: @kit1980

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136097
Approved by: https://github.com/ezyang
2024-09-21 18:56:34 +00:00
xinan.lin
67735d1ee8 [Inductor] Generalize is_cuda to specific device_type to make cpp_wrapper mode be extensible (#134693)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134693
Approved by: https://github.com/ezyang, https://github.com/EikanWang, https://github.com/jansel
2024-09-10 10:11:13 +00:00
blazej-smorawski
585c049fa3 Fix Extension attribute name in CppExtension example (#134046)
Hi! It seems there's a typo in `CppExtension` example. I think it should say `extra_link_args` instead of `extra_link_flags`. Not that I spent a few hours debugging missing kernels inside a library's fatbin or anything :D.

Please see `Extension` definition inside setuptools:
ebddeb36f7/setuptools/_distutils/extension.py (L62)

Thanks!
Błażej

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134046
Approved by: https://github.com/soulitzer
2024-08-21 13:58:16 +00:00
Syed Tousif Ahmed
42cd397a0e Loads .pyd instead of .so in MemPool test for windows (#132749)
Fixes #132650

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132749
Approved by: https://github.com/albanD
2024-08-08 14:29:56 +00:00
PyTorch MergeBot
123d9ec5bf Revert "Loads .pyd instead of .so in MemPool test for windows (#132749)"
This reverts commit 37ab0f3385.

Reverted https://github.com/pytorch/pytorch/pull/132749 on behalf of https://github.com/syed-ahmed due to Seems like periodic is still failing: 7c79e89bc5 ([comment](https://github.com/pytorch/pytorch/pull/132749#issuecomment-2274041302))
2024-08-07 18:08:44 +00:00
Syed Tousif Ahmed
37ab0f3385 Loads .pyd instead of .so in MemPool test for windows (#132749)
Fixes #132650

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132749
Approved by: https://github.com/albanD
2024-08-07 09:58:52 +00:00
Xuehai Pan
4d7bf72d93 [BE][Easy] fix ruff rule needless-bool (SIM103) (#130206)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130206
Approved by: https://github.com/malfet
2024-07-14 08:17:52 +00:00
谭九鼎
b0e5c9514d use shutil.which in check_compiler_ok_for_platform (#129069)
the same as https://github.com/pytorch/pytorch/pull/126060
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129069
Approved by: https://github.com/ezyang
2024-06-29 11:38:51 +00:00
Xu Han
b40a033c38 [cpp_extension][inductor] Fix sleef windows depends. (#128770)
# Issue:
While working on enabling inductor on PyTorch for Windows, I found a sleef lib dependency issue.

# Analysis:
After we enabled SIMD on PyTorch Windows (https://github.com/pytorch/pytorch/pull/118980), the sleef functions are called from the VEC headers, which makes sleef a dependency.

Here is a difference between Windows and Linux.
## Linux:
Linux exports functions by default, so libtorch_cpu.so links statically against sleef.a and then also exports sleef's functions.

## Windows:
Windows does not export functions by default and has many limitations on exporting functions; see https://github.com/pytorch/pytorch/issues/80604
We can't package sleef functions via torch_cpu.dll as we do on Linux.

# Solution:
Actually, we also package the sleef static lib as part of the release. We just need to help users link to sleef.lib, and it should be fine.
1. Add sleef to cpp_builder for inductor.
2. Add sleef to cpp_extension for C++ extensions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128770
Approved by: https://github.com/jgong5, https://github.com/jansel
2024-06-17 05:44:34 +00:00
Aaron Orenstein
8db9dfa2d7 Flip default value for mypy disallow_untyped_defs [9/11] (#127846)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127846
Approved by: https://github.com/ezyang
ghstack dependencies: #127842, #127843, #127844, #127845
2024-06-08 18:50:06 +00:00
cyy
d44daebdbc [Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127051
Approved by: https://github.com/cpuhrsch, https://github.com/malfet
2024-05-31 01:20:45 +00:00
PyTorch MergeBot
67739d8c6f Revert "[Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051)"
This reverts commit 699db7988d.

Reverted https://github.com/pytorch/pytorch/pull/127051 on behalf of https://github.com/PaliC due to This PR needs to be synced using the import button as there is a bug in our diff train ([comment](https://github.com/pytorch/pytorch/pull/127051#issuecomment-2138496995))
2024-05-30 01:16:57 +00:00
cyy
699db7988d [Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127051
Approved by: https://github.com/cpuhrsch, https://github.com/malfet
2024-05-29 11:58:03 +00:00
PyTorch MergeBot
cdbb2c9acc Revert "[Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051)"
This reverts commit 4fdbaa794f.

Reverted https://github.com/pytorch/pytorch/pull/127051 on behalf of https://github.com/PaliC due to This PR needs to be synced using the import button as there is a bug in our diff train ([comment](https://github.com/pytorch/pytorch/pull/127051#issuecomment-2136428735))
2024-05-29 03:02:35 +00:00
cyy
4fdbaa794f [Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127051
Approved by: https://github.com/cpuhrsch, https://github.com/malfet
2024-05-27 03:54:03 +00:00
Isuru Fernando
e3c96935c2 Support CUDA_INC_PATH env variable when compiling extensions (#126808)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126808
Approved by: https://github.com/amjames, https://github.com/ezyang
2024-05-22 02:44:32 +00:00
Daniele Trifirò
3183d65ac0 use shutil.which in _find_cuda_home (#126060)
Replace `subprocess.check_output` call with `shutil.which`, similarly to how this is done in `_find_rocm_home`
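
A minimal sketch of the lookup with `shutil.which`, assuming the CUDA home is two directory levels above the `nvcc` binary and that the `CUDA_HOME` or `CUDA_PATH` environment variables take priority. Both assumptions, and the `find_cuda_home` name, are illustrative rather than taken from this message.

```python
import os
import shutil


def find_cuda_home(env=None, which=shutil.which):
    """Guess the CUDA install root from the environment or from nvcc on PATH."""
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME") or env.get("CUDA_PATH")
    if not cuda_home:
        nvcc = which("nvcc")  # returns None when nvcc is not on PATH
        if nvcc is not None:
            # <root>/bin/nvcc -> <root>
            cuda_home = os.path.dirname(os.path.dirname(nvcc))
    return cuda_home or None
```

The `which` parameter exists only so the lookup can be exercised without a CUDA install.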
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126060
Approved by: https://github.com/r-barnes
2024-05-13 17:38:17 +00:00
Jeff Daily
ae9a4fa63c [ROCm] enforce ROCM_VERSION >= 6.0 (#125646)
Remove any code relying on ROCM_VERSION < 6.0.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125646
Approved by: https://github.com/albanD, https://github.com/eqy
2024-05-12 18:01:28 +00:00
dlyakhov
c941fee7ea [CPP extention] Baton lock is called regardless the code version (#125404)
Greetings!

Fixes #125403

Please assist me with the testing, as it is possible for my reproducer to miss the error in the code. Several (at least two) threads should enter the same part of the code at the same time to check that the file lock is actually working.
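
For readers unfamiliar with the term, a baton here is a file-based lock. The class below is a self-contained sketch of that idea using exclusive file creation. It is written for illustration under that assumption and is not the implementation touched by this PR.

```python
import os
import time


class FileBatonSketch:
    """Illustrative file lock: whoever creates the lock file holds the baton."""

    def __init__(self, lock_file_path, wait_seconds=0.1):
        self.lock_file_path = lock_file_path
        self.wait_seconds = wait_seconds
        self.fd = None

    def try_acquire(self):
        """Return True if this caller created the lock file, False otherwise."""
        try:
            # O_EXCL makes creation fail if the file already exists.
            self.fd = os.open(self.lock_file_path, os.O_CREAT | os.O_EXCL)
            return True
        except FileExistsError:
            return False

    def wait(self):
        """Poll until the current holder removes the lock file."""
        while os.path.exists(self.lock_file_path):
            time.sleep(self.wait_seconds)

    def release(self):
        """Close and remove the lock file so that waiters can proceed."""
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None
        os.remove(self.lock_file_path)
```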

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125404
Approved by: https://github.com/ezyang
2024-05-03 21:10:39 +00:00
Nikita Shulga
35c493f2cf [CPP Extension] Escape include paths (#122974)
By using `shlex.quote` on Linux/Mac and `_nt_quote_args` on Windows

Test it by adding non-existent path with spaces and single quote
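
The POSIX half of this can be sketched with `shlex.quote`, using a path that contains spaces and a single quote. The `include_flags` helper is hypothetical, and the Windows path through `_nt_quote_args` is not shown.

```python
import shlex


def include_flags(paths):
    """Turn include directories into shell-safe -I flags (POSIX quoting only)."""
    return [f"-I{shlex.quote(path)}" for path in paths]


flags = include_flags(["/tmp/dir with space/it's here", "/usr/include"])
command_fragment = " ".join(flags)
print(command_fragment)
```

Splitting `command_fragment` with `shlex.split` yields the original paths again, which is a quick way to check the quoting.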

TODO: Fix double quotes on Windows (will require touching `_nt_quote_args`, so will leave it for another day)

Fixes https://github.com/pytorch/pytorch/issues/122476

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122974
Approved by: https://github.com/Skylion007
2024-03-30 21:58:29 +00:00
jyomu
8dd4b6a78c Fix venv compatibility issue by updating python_lib_path (#121103)
`sys.executable` is the absolute path of the executable binary for the Python interpreter, which may not be an appropriate reference point here. `sys.base_exec_prefix` is more suitable, and this change correctly resolves the library when using venv. I have tested it with a venv created by rye.

https://docs.python.org/3.6/library/sys.html#sys.executable

> A string giving the absolute path of the executable binary for the Python interpreter, on systems where this makes sense. If Python is unable to retrieve the real path to its executable, [sys.executable](https://docs.python.org/3.6/library/sys.html#sys.executable) will be an empty string or None.

https://docs.python.org/3.6/library/sys.html#sys.exec_prefix

> A string giving the site-specific directory prefix where the platform-dependent Python files are installed; by default, this is also '/usr/local'. This can be set at build time with the --exec-prefix argument to the configure script. Specifically, all configuration files (e.g. the pyconfig.h header file) are installed in the directory exec_prefix/lib/pythonX.Y/config, and shared library modules are installed in exec_prefix/lib/pythonX.Y/lib-dynload, where X.Y is the version number of Python, for example 3.2.

https://docs.python.org/3.6/library/sys.html#sys.base_exec_prefix

> Set during Python startup, before site.py is run, to the same value as [exec_prefix](https://docs.python.org/3.6/library/sys.html#sys.exec_prefix). If not running in a [virtual environment](https://docs.python.org/3.6/library/venv.html#venv-def), the values will stay the same; if site.py finds that a virtual environment is in use, the values of [prefix](https://docs.python.org/3.6/library/sys.html#sys.prefix) and [exec_prefix](https://docs.python.org/3.6/library/sys.html#sys.exec_prefix) will be changed to point to the virtual environment, whereas [base_prefix](https://docs.python.org/3.6/library/sys.html#sys.base_prefix) and [base_exec_prefix](https://docs.python.org/3.6/library/sys.html#sys.base_exec_prefix) will remain pointing to the base Python installation (the one which the virtual environment was created from).
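
The quoted documentation can be checked with a short sketch: inside a virtual environment `sys.prefix` differs from `sys.base_prefix`, while `sys.base_exec_prefix` keeps pointing at the base installation. The helper names are hypothetical.

```python
import sys


def in_virtual_environment():
    """True when the interpreter is running inside a venv."""
    return sys.prefix != sys.base_prefix


def python_install_root():
    """Directory prefix of the base Python install, even inside a venv."""
    return sys.base_exec_prefix


print("executable:      ", sys.executable)
print("exec_prefix:     ", sys.exec_prefix)
print("base_exec_prefix:", python_install_root())
print("in venv:         ", in_virtual_environment())
```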
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121103
Approved by: https://github.com/ezyang
2024-03-06 17:00:46 +00:00
Han, Xu
3e382456c1 Fix compiler check (#120492)
Fixes #119304

1. Add try/except to handle failures in the compiler version check.
2. Retry querying the compiler version info.
3. Return False if the compiler info can't be obtained after two attempts.
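
The three steps above can be sketched as a small retry loop. The function name, the use of `--version`, and the exception types caught are assumptions for illustration and may differ from the actual change.

```python
import subprocess


def compiler_version_available(compiler, attempts=2):
    """Return True if `<compiler> --version` succeeds within the given attempts."""
    for _ in range(attempts):
        try:
            subprocess.check_output([compiler, "--version"], stderr=subprocess.STDOUT)
            return True
        except (OSError, subprocess.SubprocessError):
            # Missing binary or non-zero exit status: try again.
            continue
    return False
```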

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120492
Approved by: https://github.com/ezyang
2024-02-25 02:41:20 +00:00
Wang, Xiao
c83af673bc Allow CUDA extension builds to skip generating cuda dependencies during compile time (#119936)
nvcc flag `--generate-dependencies-with-compile` doesn't seem to be supported by `sccache` for now. Builds with this flag enabled will not benefit from sccache.

This PR adds an environment variable that lets users skip generating those nvcc dependencies, so that their builds can be sped up by compiler caches. If everything is a "fresh build" in CI, we don't care if there are unnecessary recompiles during incremental builds.
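
A sketch of how such a switch might gate the dependency flags. The message does not name the environment variable, so the name used below is an assumption, as are the helper and the way the flag list is assembled.

```python
import os

# Assumed variable name; the commit message does not state it.
SKIP_ENV_VAR = "TORCH_EXTENSION_SKIP_NVCC_GEN_DEPENDENCIES"


def nvcc_dependency_flags(env=None):
    """Return nvcc dependency-generation flags unless the user opted out."""
    env = os.environ if env is None else env
    if env.get(SKIP_ENV_VAR, "0") != "0":
        return []
    # "$out.d" is a ninja-style placeholder for the dependency file path.
    return ["--generate-dependencies-with-compile", "--dependency-output", "$out.d"]
```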

related: https://github.com/pytorch/pytorch/pull/49344

- [ ] todo: raise an issue to sccache

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119936
Approved by: https://github.com/ezyang
2024-02-15 07:03:59 +00:00
Mark Saroufim
7fd6b1c558 s/print/warn in arch choice in cpp extension (#119463)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119463
Approved by: https://github.com/malfet
2024-02-08 20:38:51 +00:00
Nikolay Bogoychev
46ef73505d Clarify how to get extra link flags when building CUDA/C++ extension (#118743)
Make it a bit more explicit how one passes linker arguments to the build, and point to the superclass documentation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118743
Approved by: https://github.com/ezyang
2024-02-01 22:35:25 +00:00
Catherine Lee
4f5785b6b3 Enable possibly-undefined error code (#118533)
Fixes https://github.com/pytorch/pytorch/issues/118129

Suppressions automatically added with

```
import re

with open("error_file.txt", "r") as f:
    errors = f.readlines()

error_lines = {}
for error in errors:
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True):
        code[line_number - 1] = code[line_number - 1].rstrip() + f"  # type: ignore[{error_type}]\n"
    with open(file_path, "w") as f:
        f.writelines(code)
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Co-authored-by: Catherine Lee <csl@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533
Approved by: https://github.com/Skylion007, https://github.com/zou3519
2024-01-30 21:07:01 +00:00
PyTorch MergeBot
40ece2e579 Revert "Enable possibly-undefined error code (#118533)"
This reverts commit 4f13f69a45.

Reverted https://github.com/pytorch/pytorch/pull/118533 on behalf of https://github.com/clee2000 due to sorry i'm trying to figure out a codev merge conflict, if this works i'll be back to rebase and merge ([comment](https://github.com/pytorch/pytorch/pull/118533#issuecomment-1917695185))
2024-01-30 19:00:34 +00:00
Edward Z. Yang
4f13f69a45 Enable possibly-undefined error code (#118533)
Fixes https://github.com/pytorch/pytorch/issues/118129

Suppressions automatically added with

```
import re

with open("error_file.txt", "r") as f:
    errors = f.readlines()

error_lines = {}
for error in errors:
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True):
        code[line_number - 1] = code[line_number - 1].rstrip() + f"  # type: ignore[{error_type}]\n"
    with open(file_path, "w") as f:
        f.writelines(code)
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533
Approved by: https://github.com/Skylion007, https://github.com/zou3519
2024-01-30 05:08:10 +00:00
lancerts
e6f3a4746c include a print for _get_cuda_arch_flags (#118503)
Related to #118494, it is not clear to users that the default behavior is to include **all** feasible archs (if the 'TORCH_CUDA_ARCH_LIST' is not set).

In these scenarios, a user may experience a long build time. This adds a print statement to reflect this behavior. [The `verbose` arg is not available here, and it does not seem necessary to add a `verbose` arg to this function and all its parent functions...]
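
A minimal sketch of the behavior described above, not the actual `_get_cuda_arch_flags` code. The helper name `resolve_arch_list`, its parameters, and the warning text are assumptions made for illustration only:

```python
import os
import warnings


def resolve_arch_list(feasible_archs, env=None):
    """Return the architectures to build for (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = env.get("TORCH_CUDA_ARCH_LIST")
    if not raw:
        # Variable unset: every feasible arch is included, so tell the user
        # why the build may take a long time.
        warnings.warn(
            "TORCH_CUDA_ARCH_LIST is not set; building for all feasible archs"
        )
        return list(feasible_archs)
    # Assumption: entries are separated by ';' or spaces, e.g. "8.0;8.6+PTX".
    return [arch for arch in raw.replace(" ", ";").split(";") if arch]
```

Setting `TORCH_CUDA_ARCH_LIST` to only the needed architectures before building is what avoids the long default build.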

Co-authored-by: Edward Z. Yang <ezyang@mit.edu>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118503
Approved by: https://github.com/ezyang
2024-01-29 07:03:56 +00:00
Kunal Tyagi
6c02520466 Remove unneeded comment and link for BuildExtension (#115496)
`BuildExtension` is no longer derived from object, but from `build_ext`. Py2 is also deprecated, so this comment wouldn't be required anyways

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115496
Approved by: https://github.com/Skylion007
2024-01-01 08:29:48 +00:00
Jeff Daily
8bff59e41d [ROCm] add hipblaslt support (#114329)
Disabled by default. Enable with env var DISABLE_ADDMM_HIP_LT=0. Tested on both ROCm 5.7 and 6.0.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114329
Approved by: https://github.com/malfet
2023-12-20 19:09:25 +00:00
PyTorch MergeBot
47908a608f Revert "[ROCm] add hipblaslt support (#114329)"
This reverts commit b062ea3803.

Reverted https://github.com/pytorch/pytorch/pull/114329 on behalf of https://github.com/jeanschmidt due to Reverting due to inconsistencies on internal diff ([comment](https://github.com/pytorch/pytorch/pull/114329#issuecomment-1861933267))
2023-12-19 01:04:58 +00:00
Jeff Daily
b062ea3803 [ROCm] add hipblaslt support (#114329)
Disabled by default. Enable with env var DISABLE_ADDMM_HIP_LT=0. Tested on both ROCm 5.7 and 6.0.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114329
Approved by: https://github.com/malfet
2023-12-15 15:36:46 +00:00
PyTorch MergeBot
59f7355f86 Revert "[ROCm] add hipblaslt support (#114329)"
This reverts commit bb2bb8cca1.

Reverted https://github.com/pytorch/pytorch/pull/114329 on behalf of https://github.com/atalman due to OSSCI oncall, trunk  tests are failing ([comment](https://github.com/pytorch/pytorch/pull/114329#issuecomment-1857003155))
2023-12-14 23:53:30 +00:00
Jeff Daily
bb2bb8cca1 [ROCm] add hipblaslt support (#114329)
Disabled by default. Enable with env var DISABLE_ADDMM_HIP_LT=0. Tested on both ROCm 5.7 and 6.0.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114329
Approved by: https://github.com/malfet
2023-12-14 21:41:22 +00:00
vfdev-5
a43c757275 Fixed error with cuda_ver in cpp_extension.py (#113555)
Reported in 71ca42787f (r132390833)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113555
Approved by: https://github.com/ezyang
2023-11-14 00:12:22 +00:00
ChanBong
5e10dd2c78 fix docstring issues in torch.utils (#113335)
Fixes #112634

Fixes all the issues listed except in `torch/utils/_pytree.py` as the file no longer exists.

### Error counts

|File | Count Before | Count now|
|---- | ---- | ---- |
|`torch/utils/collect_env.py` | 39 | 25|
|`torch/utils/cpp_extension.py` | 51 | 13|
|`torch/utils/flop_counter.py` | 25 | 8|
|`torch/utils/_foreach_utils.py` | 2 | 0|
|`torch/utils/_python_dispatch.py` | 26 | 25|
|`torch/utils/backend_registration.py` | 15 | 4|
|`torch/utils/checkpoint.py` | 29 | 21|

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113335
Approved by: https://github.com/ezyang
2023-11-13 19:37:25 +00:00
Nikita Shulga
0a7eef9bcf [BE] Remove stale CUDA version check from cpp_extension.py (#113447)
As at least CUDA-11.x is needed to build PyTorch on latest trunk.
But still skip `--generate-dependencies-with-compile` if running on ROCm

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113447
Approved by: https://github.com/Skylion007, https://github.com/atalman, https://github.com/PaliC, https://github.com/huydhn
2023-11-11 00:20:08 +00:00
PyTorch MergeBot
ae2c219de2 Revert "[BE] Remove stale CUDA version check from cpp_extension.py (#113447)"
This reverts commit 7ccca60927.

Reverted https://github.com/pytorch/pytorch/pull/113447 on behalf of https://github.com/malfet due to Broke ROCM ([comment](https://github.com/pytorch/pytorch/pull/113447#issuecomment-1806407892))
2023-11-10 20:46:13 +00:00
Nikita Shulga
7ccca60927 [BE] Remove stale CUDA version check from cpp_extension.py (#113447)
As at least CUDA-11.x is needed to build PyTorch on latest trunk

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113447
Approved by: https://github.com/Skylion007, https://github.com/atalman, https://github.com/PaliC, https://github.com/huydhn
2023-11-10 18:54:19 +00:00
vfdev
71ca42787f Replaced deprecated pkg_resources.packaging with packaging module (#113023)
Usage of `from pkg_resources import packaging` leads to a deprecation warning:
```
DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
```
and in strict tests where warnings are errors, this leads to CI breaks, e.g.: https://github.com/pytorch/vision/pull/8092

Replacing `pkg_resources.packaging` with `packaging` as it is now a pytorch dependency:
fa9045a872/requirements.txt (L19)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113023
Approved by: https://github.com/Skylion007, https://github.com/malfet
2023-11-10 15:06:03 +00:00
PyTorch MergeBot
aef9e43fe6 Revert "Replaced deprecated pkg_resources.packaging with packaging module (#113023)"
This reverts commit 81ea7a489a.

Reverted https://github.com/pytorch/pytorch/pull/113023 on behalf of https://github.com/atalman due to breaks nightlies ([comment](https://github.com/pytorch/pytorch/pull/113023#issuecomment-1802720774))
2023-11-08 21:39:59 +00:00
Alexander Grund
21b6030ac3 Don't set CUDA_HOME when not compiled with CUDA support (#106310)
It doesn't make sense to set this (on import!) as CUDA cannot be used with PyTorch in this case but leads to messages like
> No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
when CUDA happens to be installed which is at least confusing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106310
Approved by: https://github.com/ezyang
2023-11-06 21:48:49 +00:00
vfdev
81ea7a489a Replaced deprecated pkg_resources.packaging with packaging module (#113023)
Usage of `from pkg_resources import packaging` leads to a deprecation warning:
```
DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
```
and in strict tests where warnings are errors, this leads to CI breaks, e.g.: https://github.com/pytorch/vision/pull/8092

Replacing `pkg_resources.packaging` with `packaging` as it is now a pytorch dependency:
fa9045a872/requirements.txt (L19)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113023
Approved by: https://github.com/Skylion007
2023-11-06 20:26:32 +00:00
Shaun Walbridge
0adb28b77d Show CUDAExtension example commands as code (#112764)
The default rendering of these code snippets renders the `TORCH_CUDA_ARCH_LIST` values with typographic quotes which prevent the examples from being directly copyable. Use code style for the two extension examples.

Fixes #112763
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112764
Approved by: https://github.com/malfet
2023-11-02 21:47:50 +00:00
Jeff Daily
28c0b07d19 [ROCm] remove HCC references (#111975)
- rename `__HIP_PLATFORM_HCC__` to `__HIP_PLATFORM_AMD__`
- rename `HIP_HCC_FLAGS` to `HIP_CLANG_FLAGS`
- rename `PYTORCH_HIP_HCC_LIBRARIES` to `PYTORCH_HIP_LIBRARIES`
- workaround in tools/amd_build/build_amd.py until submodules are updated

These symbols have had a long deprecation cycle and will finally be removed in ROCm 6.0.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111975
Approved by: https://github.com/ezyang, https://github.com/hongxiayang
2023-10-26 02:39:10 +00:00
Aleksei Nikiforov
ba04d84089 S390x inductor support (#111367)
Use arch compile flags. They are needed for vectorization support on s390x.
Implement new helper functions for inductor.

This change fixes multiple tests in test_cpu_repro.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111367
Approved by: https://github.com/ezyang
2023-10-20 19:38:46 +00:00
Aaron Gokaslan
cb856b08b2 [BE]: Attach cause to some exceptions and enable RUFF TRY200 (#111496)
Did some easy fixes from enabling TRY200. Most of these seem like oversights instead of intentional. The proper way to silence intentional errors is with `from None` to note that you thought about whether it should contain the cause and decided against it.
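
A short illustration of the two patterns mentioned above. The function names and error messages are made up for this example and do not come from the PyTorch codebase:

```python
def parse_port(text):
    try:
        return int(text)
    except ValueError as err:
        # Attach the original exception as the cause of the new one.
        raise RuntimeError(f"invalid port: {text!r}") from err


def lookup(table, key):
    try:
        return table[key]
    except KeyError:
        # Deliberately drop the cause: the KeyError adds nothing useful here.
        raise ValueError(f"unknown key: {key!r}") from None
```

With `from err`, the original exception is available as `__cause__`; with `from None`, the implicit context is suppressed in the traceback.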

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111496
Approved by: https://github.com/malfet
2023-10-19 21:56:36 +00:00
Kent Gauen
bb89a9e48c Skipped CUDA Flags if C++ Extension Name includes "arch" Substring (#111211)
The CUDA architecture flags from TORCH_CUDA_ARCH_LIST will be skipped if the TORCH_EXTENSION_NAME includes the substring "arch". A C++ Extension should be allowed to have any name. I just manually skip the TORCH_EXTENSION_NAME flag when checking if one of the flags is "arch". There is probably a better fix, but I'll leave this to experts.
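
The check described above could look roughly like the following sketch. The function name `user_supplied_arch_flag` is hypothetical, and the real code in `cpp_extension.py` may be structured differently:

```python
def user_supplied_arch_flag(cflags):
    """Return True if a flag other than the extension-name define mentions 'arch'."""
    return any(
        "arch" in flag
        for flag in cflags
        # Skip the define that carries the extension name, which may itself
        # contain the substring "arch" (e.g. "search_ext").
        if "TORCH_EXTENSION_NAME" not in flag
    )
```

An extension named `search_ext` then no longer counts as having an explicit arch flag, so the flags derived from `TORCH_CUDA_ARCH_LIST` are still applied.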

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111211
Approved by: https://github.com/ezyang
2023-10-14 00:10:01 +00:00
Dmytro Dzhulgakov
a0cea517e7 Add 9.0a to cpp_extension supported compute archs (#110587)
There's an extended compute capability 9.0a for Hopper that was introduced in Cuda 12.0: https://docs.nvidia.com/cuda/archive/12.0.0/cuda-compiler-driver-nvcc/index.html#gpu-feature-list

E.g. Cutlass leverages it: 5f13dcad78/python/cutlass/emit/pytorch.py (L684)

This adds it to the list of permitted architectures to use in `cpp_extension` directly.
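
As a rough sketch of how an entry such as `9.0a` can be validated and turned into nvcc `-gencode` flags. The list `SUPPORTED_ARCHS` is an illustrative subset and `gencode_flags` is a hypothetical helper, not the function used in `cpp_extension`:

```python
SUPPORTED_ARCHS = ["8.0", "8.6", "9.0", "9.0a"]  # illustrative subset only


def gencode_flags(arch_list):
    """Translate entries like "9.0a" or "8.6+PTX" into nvcc -gencode flags."""
    flags = []
    for entry in arch_list:
        arch, _, suffix = entry.partition("+")
        if arch not in SUPPORTED_ARCHS:
            raise ValueError(f"Unknown CUDA arch ({arch})")
        num = arch.replace(".", "")  # "9.0a" -> "90a"
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if suffix == "PTX":
            # Also embed PTX so newer GPUs can JIT-compile the kernel.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags
```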
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110587
Approved by: https://github.com/ezyang
2023-10-05 17:41:06 +00:00
QuarticCat
20812d69e5 Fix extension rebuilding on Linux (#108613)
On Linux, CUDA header dependencies are not correctly tracked. After you modify a CUDA header, affected CUDA files won't be rebuilt. This PR will fix this problem.

```console
$ ninja -t deps
rep_penalty.o: #deps 2, deps mtime 1693956351892493247 (VALID)
    /home/qc/Workspace/NotMe/exllama/exllama_ext/cpu_func/rep_penalty.cpp
    /home/qc/Workspace/NotMe/exllama/exllama_ext/cpu_func/rep_penalty.h

rms_norm.cuda.o: #deps 0, deps mtime 1693961188871054130 (VALID)

rope.cuda.o: #deps 0, deps mtime 1693961188954388632 (VALID)

cuda_buffers.cuda.o: #deps 0, deps mtime 1693961188797719768 (VALID)

...
```

Historically, this line of code has been changed twice. It was first implemented in #49344 and there's no `if IS_WINDOWS`, just like now. Then in #56015 someone added `if IS_WINDOWS` for unknown reason. That PR has no description so I don't know what bug he encountered. I don't think there's any bug with these flags on Linux, at least for today. CMake generates exactly the same flags for CUDA.

```ninja
#############################################
# Rule for compiling CUDA files.

rule CUDA_COMPILER__cpp_cuda_unscanned_Debug
  depfile = $DEP_FILE
  deps = gcc
  command = ${LAUNCHER}${CODE_CHECK}/opt/cuda/bin/nvcc -forward-unknown-to-host-compiler $DEFINES $INCLUDES $FLAGS -MD -MT $out -MF $DEP_FILE -x cu -c $in -o $out
  description = Building CUDA object $out
```

where `-MD` is short for `--generate-dependencies-with-compile` and `-MF` is short for `--dependency-output`. My words can be verified by `nvcc --help`.
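
A minimal sketch of emitting such a rule from Python, assuming a hypothetical helper `cuda_compile_rule` and illustrative ninja variable names (`$cuda_cflags`); the rule actually generated by `cpp_extension.py` may differ in detail:

```python
def cuda_compile_rule(nvcc="nvcc", track_deps=True):
    """Build the lines of a ninja rule for CUDA sources (illustrative layout)."""
    gen_deps = (
        " --generate-dependencies-with-compile --dependency-output $out.d"
        if track_deps
        else ""
    )
    lines = ["rule cuda_compile"]
    if track_deps:
        # Let ninja read the gcc-style depfile that nvcc writes for each object.
        lines += ["  depfile = $out.d", "  deps = gcc"]
    lines.append(f"  command = {nvcc}{gen_deps} -c $in -o $out $cuda_cflags")
    return lines
```

With `track_deps=True`, ninja records the headers each CUDA object depends on, so editing a header triggers a rebuild of the affected objects.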

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108613
Approved by: https://github.com/ezyang
2023-09-06 17:58:21 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)
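
For illustration, this is the kind of pattern RUF017 flags, next to a linear-time alternative. The sample data is made up:

```python
from itertools import chain

nested = [[1, 2], [3], [4, 5]]

# Flagged pattern: each "+" copies the accumulated list, so the total work
# grows quadratically with the number of elements.
flat_quadratic = sum(nested, [])

# Linear alternative: walk every inner list exactly once.
flat_linear = list(chain.from_iterable(nested))
```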

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was prob accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Xu Han
3f3479e85e reduce header file to boost cpp_wrapper build. (#107585)
1. Reduce cpp_wrapper un-used header files.
2. Clean pch cache, when use_pch is False.

The first change will reduce the build time from 7.35s to 4.94s.

Before change:
![image](https://github.com/pytorch/pytorch/assets/8433590/fc5c1d37-ec40-44f3-8d4d-bf26bdc674bb)
After change:
![image](https://github.com/pytorch/pytorch/assets/8433590/c7ccadd2-bf3a-4d30-bf56-6e3b0230a194)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107585
Approved by: https://github.com/ezyang, https://github.com/jansel, https://github.com/jgong5
2023-08-22 11:58:47 +00:00
Han, Xu
5ed60477a7 Optimize load inline via pch (#106696)
Add a precompiled header (PCH) to reduce load_inline build time.
PCH is a gcc built-in mechanism: https://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Precompiled-Headers.html

Add a PCH for '#include <torch/extension.h>'. This file is used in all load_inline modules, so all load_inline modules can benefit from this PR.

Changes:
1. Add a PCH signature to guarantee that the PCH (gch) file takes effect.
2. Unify the functions that get the cxx compiler.
3. Unify the functions that get the build flags.
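
A rough sketch of the idea, under stated assumptions: `pch_command` and `pch_signature` are hypothetical helpers, and hashing the compiler plus flags is only one plausible way to build a signature, not necessarily the scheme this PR uses. The `-x c++-header` invocation is the standard gcc way to precompile a header:

```python
import hashlib


def pch_command(compiler, header, flags):
    """Command that precompiles `header` into `<header>.gch` (gcc-style)."""
    return [compiler, "-x", "c++-header", header, "-o", header + ".gch", *flags]


def pch_signature(compiler, flags):
    """Hash of the inputs that must match for a cached .gch to be reused."""
    payload = " ".join([compiler, *flags])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

If the stored signature differs from the current one, the cached `.gch` would be rebuilt instead of reused.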

Before this PR:
![image](https://github.com/pytorch/pytorch/assets/8433590/f190cdcb-236c-4312-b165-d419a7efafe3)

After this PR:
![image](https://github.com/pytorch/pytorch/assets/8433590/b45c5ad3-e902-4fc8-b450-743cf73505a4)

Compiling time is reduced from 14.06s to 7.36s.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106696
Approved by: https://github.com/jgong5, https://github.com/jansel
2023-08-21 10:08:30 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
Nikita Shulga
bcc0f4bcab Move ASAN to clang12 and Ubuntu-22.04 (Jammy) (#106355)
- Modify `install_conda` to remove libstdc++ from libstdcxx-ng to use one from OS
- Modify `install_torchvision` to workaround weird glibc bug, where malloc interposers (such as ASAN) are causing a hang in internationalization library, see https://sourceware.org/bugzilla/show_bug.cgi?id=27653 and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90589
- Modify `torch.utils.cpp_extension` to recognize Ubuntu's clang as supported compiler

Extracted from https://github.com/pytorch/pytorch/pull/105260
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106355
Approved by: https://github.com/huydhn
ghstack dependencies: #106354
2023-08-03 05:36:04 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`
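
For illustration, the kind of rewrite this produces (the values here are made up):

```python
name, count = "cpp_extension.py", 3

# Before: str.format with positional placeholders.
old_style = "{}: {} warnings".format(name, count)

# After: the equivalent f-string.
new_style = f"{name}: {count} warnings"
```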

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Justin Chu
abc1cadddb [BE] Enable ruff's UP rules and autoformat utils/ (#105424)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105424
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-18 20:17:25 +00:00
lcskrishna
004ff536e8 [ROCm] Fix circular recursion issue in hipification (#104085)
This PR fixes the circular recursion issue during the hipification process by introducing `current_state` to track whether a file has already been processed for hipification (iterative DFS).
The issue arises when two header files include each other, which leads to circular recursion or an infinite loop.

Fixes the related issues such as :
https://github.com/pytorch/pytorch/issues/93827
https://github.com/ROCmSoftwarePlatform/hipify_torch/issues/39

Error log:
```
  File "/opt/conda/lib/python3.8/posixpath.py", line 471, in relpath
    start_list = [x for x in abspath(start).split(sep) if x]
  File "/opt/conda/lib/python3.8/posixpath.py", line 375, in abspath
    if not isabs(path):
  File "/opt/conda/lib/python3.8/posixpath.py", line 63, in isabs
    sep = _get_sep(s)
  File "/opt/conda/lib/python3.8/posixpath.py", line 42, in _get_sep
    if isinstance(path, bytes):
RecursionError: maximum recursion depth exceeded while calling a Python object
```
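
A minimal sketch of an iterative DFS with per-file state, assuming the include graph is given as a plain dict. The function name `collect_includes` and the state labels are illustrative and not taken from the hipify code:

```python
def collect_includes(root, includes):
    """Visit `root` and everything it includes, tolerating include cycles."""
    state = {}  # path -> "in_progress" | "done"
    order = []  # files in the order their processing finished
    stack = [root]
    while stack:
        path = stack[-1]
        if path not in state:
            state[path] = "in_progress"
            for dep in includes.get(path, []):
                # A file that is already in progress or done is never pushed
                # again, which is what breaks the cycle.
                if dep not in state:
                    stack.append(dep)
        else:
            stack.pop()
            if state[path] == "in_progress":
                state[path] = "done"
                order.append(path)
    return order
```

Because the traversal uses an explicit stack and never revisits a file with recorded state, two headers that include each other terminate instead of hitting `RecursionError`.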

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104085
Approved by: https://github.com/jithunnair-amd, https://github.com/malfet
2023-07-01 03:25:51 +00:00
Felix Erkinger
e140c9cc92 Fixes ROCM_HOME detection in case no hipcc is found in path (#95634)
If ROCM_HOME is not set as an environment variable,
the code tries to find hipcc in the path,
but when hipcc is missing it fails with an empty string instead of an exception,
returning an empty string instead of the hardcoded '/opt/rocm' as the third case.
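
The intended three-step lookup can be sketched as follows. The function name `find_rocm_home` and its injectable `env` and `which` parameters are assumptions made so the example is testable; this is not the actual implementation:

```python
import os
import shutil


def find_rocm_home(env=None, which=shutil.which):
    """Resolve the ROCm root: env var, then hipcc on PATH, then the default."""
    env = os.environ if env is None else env
    # 1. An explicitly set environment variable wins.
    home = env.get("ROCM_HOME")
    if home:
        return home
    # 2. Otherwise derive the root from hipcc, i.e. <root>/bin/hipcc.
    hipcc = which("hipcc")
    if hipcc:
        return os.path.dirname(os.path.dirname(os.path.realpath(hipcc)))
    # 3. Nothing found: fall back to the hardcoded default instead of "".
    return "/opt/rocm"
```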

Fixes #95633

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95634
Approved by: https://github.com/jithunnair-amd, https://github.com/ezyang
2023-06-28 19:39:26 +00:00
albanD
b81f1d1bee Speed up cpp extensions re-compilation (#104280)
Fixes https://github.com/pytorch/pytorch/issues/68066 to a large extent.

This is achieved by not touching files that don't need changing to make sure the ninja caching works as expected.
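
The idea can be sketched as follows, assuming a hypothetical helper `write_if_changed`; the real helper in `cpp_extension.py` may have a different name and signature:

```python
import os


def write_if_changed(path, new_content):
    """Write `new_content` to `path` only if it differs; return True if written."""
    if os.path.exists(path):
        with open(path) as f:
            if f.read() == new_content:
                # Leave the file (and its mtime) untouched so ninja sees
                # nothing to rebuild.
                return False
    with open(path, "w") as f:
        f.write(new_content)
    return True
```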
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104280
Approved by: https://github.com/fmassa
2023-06-28 17:06:07 +00:00
Nikita Shulga
347463fddf [cpp-extensions] Add clang to the list of supported Linux compilers (#103349)
Not sure why it was excluded previously (an oversight, I guess).
Also, please note that `clang++` is already considered an acceptable compiler (as it ends with `g++` ;))
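
To illustrate the remark about `clang++`, here is a sketch of a suffix-based check. The tuple `ACCEPTED_SUFFIXES` and the function `is_accepted_compiler` are hypothetical, and the real matching logic may differ:

```python
import os

ACCEPTED_SUFFIXES = ("gcc", "g++", "clang", "clang++")  # illustrative list


def is_accepted_compiler(compiler_path):
    """Return True if the compiler's file name ends with an accepted name."""
    name = os.path.basename(compiler_path)
    return name.endswith(ACCEPTED_SUFFIXES)
```

Note that `"clang++".endswith("g++")` is already `True`, which is why only plain `clang` needed to be added.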

<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at 55aa7db</samp>

> _`clang` or `gcc`, we don't care what you use_
> _We'll build our extensions with the tools we choose_
> _Don't try to stop us with your version string_
> _We'll update our logic and make our code sing_
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103349
Approved by: https://github.com/seemethere
2023-06-10 02:53:38 +00:00
Li-Huai (Allan) Lin
3c0072e7c0 [MPS] Prerequisite for MPS C++ extension (#102483)
In order to add MPS kernels to the torchvision codebase, we need to expose MPS headers and allow objc++ files to be used in extensions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102483
Approved by: https://github.com/malfet
2023-06-07 17:28:31 +00:00
Matthew Hoffman
29da75cc55 Enable mypy allow redefinition (#102046)
Related #101528

I tried to enable this in another PR but it uncovered a bunch of type errors: https://github.com/pytorch/pytorch/actions/runs/4999748262/jobs/8956555243?pr=101528#step:10:1305

The goal of this PR is to fix these errors.

---

This PR enables [allow_redefinition = True](https://mypy.readthedocs.io/en/stable/config_file.html#confval-allow_redefinition) in `mypy.ini`, which allows for a common pattern:

> Allows variables to be redefined with an arbitrary type, as long as the redefinition is in the same block and nesting level as the original definition.

`allow_redefinition` allows mypy to be more flexible by allowing reassignment to an existing variable with a different type... for instance (from the linked PR):

4a1e9230ba/torch/nn/parallel/data_parallel.py (L213)

A `Sequence[Union[int, torch.device]]` is narrowed to `Sequence[int]` thru reassignment to the same variable.
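
A small stand-alone example of the same pattern, using `str` in place of `torch.device` so that it runs without PyTorch. The function `to_indices` is made up for illustration:

```python
from typing import List, Sequence, Union


def to_indices(device_ids: Sequence[Union[int, str]]) -> List[int]:
    # With allow_redefinition = True, mypy is expected to accept this
    # reassignment and treat `device_ids` as List[int] from here on.
    device_ids = [int(d) for d in device_ids]
    return device_ids
```

Without the option, mypy keeps the declared parameter type for `device_ids`, so the `return` would not match `List[int]`.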

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102046
Approved by: https://github.com/ezyang
2023-05-24 07:05:30 +00:00
pminimd
59a3759d97 Update cpp_extension.py (#101285)
When we need to link extra libs, we should notice that 64-bit CUDA may be installed in "lib", not in "lib64".

<!--
copilot:summary
-->
### <samp>🤖 Generated by Copilot at 05c1ca6</samp>

Improve CUDA compatibility in `torch.utils.cpp_extension` by checking for `lib64` or `lib` directory.
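
A minimal sketch of such a check, assuming a hypothetical helper `cuda_lib_dir` that receives the CUDA root directory:

```python
import os


def cuda_lib_dir(cuda_home):
    """Prefer <cuda_home>/lib64, falling back to <cuda_home>/lib if only that exists."""
    lib_dir = "lib64"
    if not os.path.exists(os.path.join(cuda_home, lib_dir)) and os.path.exists(
        os.path.join(cuda_home, "lib")
    ):
        lib_dir = "lib"
    return os.path.join(cuda_home, lib_dir)
```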
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101285
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-05-15 22:47:41 +00:00
Richard Barnes
5f92909faf Use correct standard when compiling NVCC on Windows (#100031)
Test Plan: Sandcastle

Differential Revision: D45129001

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100031
Approved by: https://github.com/ngimel
2023-05-01 16:28:23 +00:00
Aaron Gokaslan
e2a3817dfd [BE] Enable C419 rule for any all shortcircuiting (#99890)
Apparently https://github.com/pytorch/pytorch/pull/78142 made torch.JIT allow for simple generator expressions which allows us to enable rules that replace unnecessary list comprehensions with generators in any/all. This was originally part of #99280 but I split it off into this PR so that it can be easily reverted should anything break.
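
A small demonstration of why the generator form matters: `any` can stop at the first truthy item, but only if the items are produced lazily. The predicate `is_bad` and the sample values are made up:

```python
calls = []


def is_bad(x):
    calls.append(x)  # record every evaluation
    return x < 0


values = [1, -2, 3, 4]

# List comprehension: the whole list is built before any() looks at it.
any([is_bad(v) for v in values])
eager_calls = len(calls)

calls.clear()

# Generator expression: evaluation stops at the first truthy result.
any(is_bad(v) for v in values)
lazy_calls = len(calls)
```

Here `eager_calls` counts an evaluation for every element, while `lazy_calls` stops counting once `-2` is reached.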

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890
Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet
2023-04-25 15:02:13 +00:00
PyTorch MergeBot
cfacb5eaaa Revert "Use correct standard when compiling NVCC on Windows (#99492)"
This reverts commit db6944562e.

Reverted https://github.com/pytorch/pytorch/pull/99492 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally
2023-04-19 20:51:26 +00:00
Richard Barnes
db6944562e Use correct standard when compiling NVCC on Windows (#99492)
Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D45108690

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99492
Approved by: https://github.com/ezyang
2023-04-19 20:36:05 +00:00
Pruthvi Madugundu
08f125bcac [ROCm] Remove usage of deprecated ROCm component header includes (#97620)
- clang parameter 'amdgpu-target' changed to 'offload-arch'
- HIP and MIOpen includes path updated for extensions

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97620
Approved by: https://github.com/ezyang, https://github.com/jithunnair-amd
2023-03-28 19:28:38 +00:00
Stas Bekman
8275e5d2a8 [cpp_extension.py] fix bogus _check_cuda_version (#97602)
Currently if `setuptools<49.4.0` and there is a minor version mismatch `_check_cuda_version` fails with a misleading non-actionable error:
```
2023-03-24T20:21:35.0625644Z   RuntimeError:
2023-03-24T20:21:35.0628441Z   The detected CUDA version (11.2) mismatches the version that was used to compile
2023-03-24T20:21:35.0630681Z   PyTorch (11.3). Please make sure to use the same CUDA versions.
```
This condition shouldn't be failing since minor version match isn't required.

It fails because the other condition to have a certain version of `setuptools` isn't met. But that condition is written in a comment (!!!). So this PR changes it to actually tell the user how to fix the problem.

While at it, I adjusted the version number as a lower `setuptools>=49.4.0` is sufficient for this to work.

Thanks.

p.s. this problem manifests on `nvidia/cuda:11.2.2-cudnn8-devel-ubuntu20.04` docker image.
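
The version policy described above (a major mismatch is an error, a minor mismatch only a warning) can be sketched like this. `check_cuda_version` and its messages are illustrative and do not reproduce the actual `_check_cuda_version`, which also deals with the `setuptools` requirement:

```python
import warnings


def check_cuda_version(detected, built_with):
    """Compare "major.minor" version strings, e.g. "11.2" against "11.3"."""
    d_major, d_minor = (int(part) for part in detected.split(".")[:2])
    b_major, b_minor = (int(part) for part in built_with.split(".")[:2])
    if d_major != b_major:
        raise RuntimeError(
            f"The detected CUDA version ({detected}) mismatches the version "
            f"that was used to compile PyTorch ({built_with})."
        )
    if d_minor != b_minor:
        # A minor mismatch is tolerated; just let the user know.
        warnings.warn(
            f"Detected CUDA {detected} has a minor version mismatch with {built_with}."
        )
```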

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97602
Approved by: https://github.com/ezyang
2023-03-27 15:15:57 +00:00
mikey dagitses
461f088c96 add -std=c++17 to windows cuda compilations (#97515)
add -std=c++17 to windows cuda compilations

Summary:
We're using C++17 in headers that are compiled by C++
extensions. Support for this was not added when we upgraded to C++17.
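
A sketch of the general idea, assuming a hypothetical helper `with_default_std` that only adds the flag when the caller has not chosen a standard; the actual change may be implemented differently:

```python
def with_default_std(cflags, default="-std=c++17"):
    """Return a copy of `cflags` with a C++ standard flag ensured."""
    if any(flag.startswith("-std=") for flag in cflags):
        # The caller already picked a standard; leave the flags unchanged.
        return list(cflags)
    return [*cflags, default]
```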

Test Plan: Rely on CI.

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/97515).
* #97175
* __->__ #97515
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97515
Approved by: https://github.com/ezyang
2023-03-26 15:23:52 +00:00
Kazuaki Ishizaki
622a11d512 Fix typos under torch/utils directory (#97516)
This PR fixes typos in comments and messages of `.py` files under `torch/utils` directory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97516
Approved by: https://github.com/ezyang
2023-03-24 16:53:39 +00:00