pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 00:20:18 +01:00

Author	SHA1	Message	Date
Xuehai Pan	b6bdb67f82	[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 ) Changes by apply order: 1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`. 2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`. 3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first. `.parent{...}.absolute()` -> `.absolute().parent{...}` 4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.) `.parent.parent.parent.parent` -> `.parents[3]` 5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~ ~`.parents[3]` -> `.parents[4 - 1]`~ 6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-12-29 17:23:13 +00:00
PyTorch MergeBot	475656fd9c	Revert "[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 )" This reverts commit `2293fe1024`. Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/malfet due to failing internal ROCM builds with error: ModuleNotFoundError: No module named hipify ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2562973920))	2024-12-26 17:32:23 +00:00
PyTorch MergeBot	cc4e70b7c3	Revert "Use absolute path `path.resolve()` -> `path.absolute()` (#129409 )" This reverts commit `135c7db99d`. Reverted https://github.com/pytorch/pytorch/pull/129409 on behalf of https://github.com/malfet due to need to revert to as dependency of https://github.com/pytorch/pytorch/pull/129374 ([comment](https://github.com/pytorch/pytorch/pull/129409#issuecomment-2562969825))	2024-12-26 17:26:06 +00:00
Xuehai Pan	135c7db99d	Use absolute path `path.resolve()` -> `path.absolute()` (#129409 ) Changes: 1. Always explicit `.absolute()`: `Path(__file__)` -> `Path(__file__).absolute()` 2. Replace `path.resolve()` with `path.absolute()` if the code is resolving the PyTorch repo root directory. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129409 Approved by: https://github.com/albanD	2024-12-24 08:33:08 +00:00
Xuehai Pan	2293fe1024	[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 ) Changes by apply order: 1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`. 2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`. 3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first. `.parent{...}.absolute()` -> `.absolute().parent{...}` 4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.) `.parent.parent.parent.parent` -> `.parents[3]` 5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~ ~`.parents[3]` -> `.parents[4 - 1]`~ 6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-12-21 22:08:01 +00:00
PyTorch MergeBot	969b07b96f	Revert "[ROCm] CK Flash Attention Backend (#138947 )" This reverts commit `500d02921b`. Reverted https://github.com/pytorch/pytorch/pull/138947 on behalf of https://github.com/atalman due to Breaks default windows checkout ([comment](https://github.com/pytorch/pytorch/pull/138947#issuecomment-2548998359))	2024-12-17 16:46:57 +00:00
Andy Lugo	500d02921b	[ROCm] CK Flash Attention Backend (#138947 ) Replaces https://github.com/ROCm/pytorch/pull/1592 This PR contains the initial implementation of SDPA with composable_kernel backend. The CK path can be forced by simply calling `torch.backends.cuda.preferred_rocm_fa_library("ck")`. Similarly, you can force the incumbent aotriton implementation by passing in "aotriton" or "default". As you'd expect, not setting this option will result in aotriton to be used as the backend. In the case of CK, if pytorch deems flash attention usable, then it will use the CK path in all the same places aotriton would have been used. This PR makes no changes to the heuristics which select which attention scheme to use (i.e. flash attention vs memory efficient attention vs math etc etc). It only gets called when flash attention is both enabled (via `USE_FLASH_ATTENTION`) and is selected at runtime by the existing heuristics. Files located in pytorch/aten/src/ATen/native/transformers/hip/flash_attn/ck/mha* have been pulled from https://github.com/Dao-AILab/flash-attention courtesy of @tridao's hard work who is the co-author NOTE: In order to use this backend, the user MUST set USE_CK_FLASH_ATTENTION=1 in their environment when they build PyTorch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138947 Approved by: https://github.com/pruthvistony, https://github.com/xw285cornell, https://github.com/leitian Co-authored-by: Xiaodong Wang <xw285@cornell.edu>	2024-12-17 02:18:07 +00:00
Bin Bao	d833f49602	[reland][Inductor] Rename `cpp_wrapper_cuda.py` as `cpp_wrapper_gpu.py` (#136046 ) Summary: Reland https://github.com/pytorch/pytorch/pull/135313 after fixing internal build issues Test Plan: CI Differential Revision: D62658837 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136046 Approved by: https://github.com/chenyang78, https://github.com/etaf, https://github.com/jansel	2024-09-16 14:35:19 +00:00
PyTorch MergeBot	deee21cb78	Revert "[Inductor] Rename `cpp_wrapper_cuda.py` as `cpp_wrapper_gpu.py` (#135313 )" This reverts commit `16b37b309f`. Reverted https://github.com/pytorch/pytorch/pull/135313 on behalf of https://github.com/izaitsevfb due to breaks internal builds ([comment](https://github.com/pytorch/pytorch/pull/135313#issuecomment-2349662091))	2024-09-13 17:53:21 +00:00
xinan.lin	16b37b309f	[Inductor] Rename `cpp_wrapper_cuda.py` as `cpp_wrapper_gpu.py` (#135313 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/135313 Approved by: https://github.com/jansel, https://github.com/desertfire ghstack dependencies: #135312	2024-09-11 23:59:54 +00:00
PyTorch MergeBot	a4cf9653ee	Revert "Remove Caffe2 code from tool scripts (#134941 )" This reverts commit `c818ecd169`. Reverted https://github.com/pytorch/pytorch/pull/134941 on behalf of https://github.com/kit1980 due to breaking internal builds - The path `caffe2/operators/hip/gather_op.cuh` does not exist ([comment](https://github.com/pytorch/pytorch/pull/134941#issuecomment-2332636624))	2024-09-05 21:12:54 +00:00
cyy	c818ecd169	Remove Caffe2 code from tool scripts (#134941 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/134941 Approved by: https://github.com/ezyang	2024-09-04 03:47:58 +00:00
Xuehai Pan	f6838d521a	[BE][Easy][5/19] enforce style for empty lines in import segments in `tools/` and `torchgen/` (#129756 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129756 Approved by: https://github.com/ezyang	2024-07-17 06:44:35 +00:00
Benson Ma	b1942a1af4	[fbgemm_gpu] Break up `fbgemm_cuda_utils.cuh`, pt 10 (#130468 ) Summary: X-link: https://github.com/pytorch/FBGEMM/pull/2814 X-link: https://github.com/facebookresearch/FBGEMM/pull/19 - Break up `fbgemm_cuda_utils.cuh`, pt 10 Test Plan: ``` buck2 targets //deeplearning/fbgemm/fbgemm_gpu/test/jagged/... \| grep -v '-' \| xargs -I % sh -c 'buck2 run @//mode/opt -c fbcode.nvcc_arch=v100 -c fbcode.platform=platform010 % \|\| exit 255' buck2 targets //deeplearning/fbgemm/fbgemm_gpu/test/tbe/... \| grep -v '-' \| xargs -I % sh -c 'buck2 run @//mode/opt -c fbcode.nvcc_arch=v100 -c fbcode.platform=platform010 % \|\| exit 255' buck2 targets //deeplearning/fbgemm/fbgemm_gpu/test/sparse/... \| grep -v '-' \| xargs -I % sh -c 'buck2 run @//mode/opt -c fbcode.nvcc_arch=v100 -c fbcode.platform=platform010 % \|\| exit 255' buck2 build --config fbcode.enable_gpu_sections=true --flagfile fbcode//mode/dev-nosan-amd-gpu fbcode//smart/inference_platform_sp/llm_predictor_amd:service buck2 build --flagfile fbcode//mode/amd-gpu fbcode//hpc/ops:sparse_ops buck2 build --flagfile fbcode//mode/dev-nosan-amd-gpu fbcode//caffe2/benchmarks/operator_benchmark/pt:add_test ``` Reviewed By: spcyppt Differential Revision: D59545097 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130468 Approved by: https://github.com/ezyang	2024-07-11 07:10:27 +00:00
PyTorch MergeBot	3d96217891	Revert "[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 )" This reverts commit `9e1f3ecaa7`. Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is still failing with the same error ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2197801405))	2024-06-29 00:47:15 +00:00
Xuehai Pan	9e1f3ecaa7	[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 ) Changes by apply order: 1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`. 2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`. 3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first. `.parent{...}.absolute()` -> `.absolute().parent{...}` 4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.) `.parent.parent.parent.parent` -> `.parents[3]` 5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~ ~`.parents[3]` -> `.parents[4 - 1]`~ 6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-06-28 00:35:15 +00:00
PyTorch MergeBot	895316119d	Revert "[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 )" This reverts commit `0314c4c101`. Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it causes lots of internal build failures where they fail to find hipify module ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2192437052))	2024-06-26 19:03:57 +00:00
Xuehai Pan	0314c4c101	[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 ) Changes by apply order: 1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`. 2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`. 3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first. `.parent{...}.absolute()` -> `.absolute().parent{...}` 4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.) `.parent.parent.parent.parent` -> `.parents[3]` 5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~ ~`.parents[3]` -> `.parents[4 - 1]`~ 6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-06-25 08:28:38 +00:00
Yang Chen	bc7f3efb09	[aot_inductor] move CppWrapperCodeGen into a separate file (#119871 ) This reverts commit `d8e319a961`. Differential Revision: [D53817853](https://our.internmc.facebook.com/intern/diff/D53817853) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119871 Approved by: https://github.com/albanD, https://github.com/khabinov ghstack dependencies: #119870	2024-02-16 08:14:20 +00:00
Yang Chen	78c9b2948a	[aot_inductor] move CudaWrapperCodeGen into a separate file (#119870 ) This reverts commit `3ab08946d5`. Differential Revision: [D53817852](https://our.internmc.facebook.com/intern/diff/D53817852) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119870 Approved by: https://github.com/khabinov	2024-02-16 08:10:51 +00:00
Xinya Zhang	e3ca7346ce	Re-add initial Flash Attention support on ROCM (#115981 ) Note about the Updates: This PR: 1. skips more flash attention related UTs on MI200 2. Fix additional ATen compiling errors after hipification 3. Fix the author "root" of a specific commit 4. Includes the patch from Nikita in favor of block level static initialization. CAVEAT: This revised PR has a commit that modifies the CI to force its running on MI200 nodes. That specific commit must be reverted before merge. Original PR (https://github.com/pytorch/pytorch/pull/114309) Note: This pull requests add initial Flash Attention support for AMD/ROCM platform. It added a specialized Triton repository/branch as a compile-time dependency for Flash Attention math library on AMD/ROCM. This triton submodule is not used at runtime and will not be shipped to the final pytorch package. We have the plan to release this specialized Triton as a separate project. Know limitations: - Only supports MI200 series GPU (i.e., `gcnArchName == gfx90a:sramecc+:xnack-`. - Only supports power of two sequence lengths. - No support for varlen APIs. - Only support head dimension 16,32,64,128. - Performance is still being optimized. Fixes #112997 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115981 Approved by: https://github.com/malfet	2024-01-04 22:21:31 +00:00
Jeff Daily	e3aefe2970	Revert "Initial Flash Attention support on ROCM (#114309 )" (#115975 ) This reverts commit `5bddbed399`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115975 Approved by: https://github.com/atalman, https://github.com/malfet	2023-12-16 03:40:14 +00:00
Xinya Zhang	5bddbed399	Initial Flash Attention support on ROCM (#114309 ) This pull requests add initial Flash Attention support for AMD/ROCM platform. It added a specialized Triton repository/branch as a compile-time dependency for Flash Attention math library on AMD/ROCM. This triton submodule is not used at runtime and will not be shipped to the final pytorch package. We have the plan to release this specialized Triton as a separate project. Know limitations: - [ ] Only supports MI200 series GPU (i.e., `gcnArchName == gfx90a:sramecc+:xnack-`. - [ ] Only supports power of two sequence lengths. - [ ] No support for varlen APIs. - [ ] Only support head dimension 16,32,64,128. - [ ] Performance is still being optimized. Fixes https://github.com/pytorch/pytorch/issues/112997 Pull Request resolved: https://github.com/pytorch/pytorch/pull/114309 Approved by: https://github.com/jeffdaily, https://github.com/malfet --------- Co-authored-by: Joseph Groenenboom <joseph.groenenboom@amd.com>	2023-12-14 08:52:57 -08:00
Jack Taylor	4a4c9fb0b8	[ROCm] Add ROCm AMDGPU support for inductor cpp codegen (#105141 ) Follows from previous enablement attempt: https://github.com/pytorch/pytorch/pull/101797 Adds support for hsaco binaries in inductor's cpp_wrapper codegen and enables the CUDA tests in test_cpp_wrapper. This PR also brings in additional required hipify mappings for the wrapper codegen file. NOTE: we can unskip some of these tests when we enabled MI210 runners. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105141 Approved by: https://github.com/jansel, https://github.com/malfet	2023-11-29 15:11:24 +00:00
Jeff Daily	28c0b07d19	[ROCm] remove HCC references (#111975 ) - rename `__HIP_PLATFORM_HCC__` to `__HIP_PLATFORM_AMD__` - rename `HIP_HCC_FLAGS` to `HIP_CLANG_FLAGS` - rename `PYTORCH_HIP_HCC_LIBRARIES` to `PYTORCH_HIP_LIBRARIES` - workaround in tools/amd_build/build_amd.py until submodules are updated These symbols have had a long deprecation cycle and will finally be removed in ROCm 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111975 Approved by: https://github.com/ezyang, https://github.com/hongxiayang	2023-10-26 02:39:10 +00:00
wangxiyuan	5589b81173	Remove redundant change for gloo (#106750 ) HIP deprecated symbols are removed by `d74270ece2` and `fe2ad9c328` which is included in pytorch gloo already. gloo in pytorch master: `597accfd79` There is no need to fix it in pytorch now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106750 Approved by: https://github.com/jithunnair-amd, https://github.com/kit1980	2023-09-26 03:46:14 +00:00
PyTorch MergeBot	5a7c008b30	Revert "[ROCm] Add ROCm AMDGPU support for inductor cpp codegen (#105141 )" This reverts commit `8ff00360a4`. Reverted https://github.com/pytorch/pytorch/pull/105141 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/105141#issuecomment-1715629007))	2023-09-12 12:29:55 +00:00
Jack Taylor	8ff00360a4	[ROCm] Add ROCm AMDGPU support for inductor cpp codegen (#105141 ) Follows from previous enablement attempt: https://github.com/pytorch/pytorch/pull/101797 Adds support for hsaco binaries in inductor's cpp_wrapper codegen and enables the CUDA tests in test_cpp_wrapper. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105141 Approved by: https://github.com/jansel	2023-09-09 16:28:56 +00:00
Justin Chu	4cc1745b13	[BE] f-stringify torch/ and scripts (#105538 ) This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`. - https://docs.python.org/3/reference/lexical_analysis.html#f-strings - https://pypi.org/project/flynt/ Command used: ``` flynt torch/ -ll 120 flynt scripts/ -ll 120 flynt tools/ -ll 120 ``` and excluded `collect_env.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538 Approved by: https://github.com/ezyang, https://github.com/malfet	2023-07-21 19:35:24 +00:00
Justin Chu	14d87bb5ff	[BE] Enable ruff's UP rules and autoformat tools and scripts (#105428 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105428 Approved by: https://github.com/albanD, https://github.com/soulitzer, https://github.com/malfet	2023-07-19 01:24:44 +00:00
BowenBao	60a68477a6	Bump black version to 23.1.0 (#96578 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578 Approved by: https://github.com/ezyang	2023-03-15 06:27:59 +00:00
jjsjann123	c11b301bcd	[NVFUSER] refactor nvfuser build (#89621 ) This PR is the first step towards refactors the build for nvfuser in order to have the coegen being a standalone library. Contents inside this PR: 1. nvfuser code base has been moved to `./nvfuser`, from `./torch/csrc/jit/codegen/cuda/`, except for registration code for integration (interface.h/interface.cpp) 2. splits the build system so nvfuser is generating its own `.so` files. Currently there are: - `libnvfuser_codegen.so`, which contains the integration, codegen and runtime system of nvfuser - `nvfuser.so`, which is nvfuser's python API via pybind. Python frontend is now exposed via `nvfuser._C.XXX` instead of `torch._C._nvfuser` 3. nvfuser cpp tests is currently being compiled into `nvfuser_tests` 4. cmake is refactored so that: - nvfuser now has its own `CMakeLists.txt`, which is under `torch/csrc/jit/codegen/cuda/`. - nvfuser backend code is not compiled inside `libtorch_cuda_xxx` any more - nvfuser is added as a subdirectory under `./CMakeLists.txt` at the very end after torch is built. - since nvfuser has dependency on torch, the registration of nvfuser at runtime is done via dlopen (`at::DynamicLibrary`). This avoids circular dependency in cmake, which will be a nightmare to handle. For details, look at `torch/csrc/jit/codegen/cuda/interface.cpp::LoadingNvfuserLibrary` Future work that's scoped in following PR: - Currently since nvfuser codegen has dependency on torch, we need to refactor that out so we can move nvfuser into a submodule and not rely on dlopen to load the library. @malfet - Since we moved nvfuser into a cmake build, we effectively disabled bazel build for nvfuser. This could impact internal workload at Meta, so we need to put support back. cc'ing @vors Pull Request resolved: https://github.com/pytorch/pytorch/pull/89621 Approved by: https://github.com/davidberard98	2023-01-26 02:50:44 +00:00
Jeff Daily	d09486ab23	[ROCm] enable nvfuser (#82498 ) ### Description The nvfuser is enabled for ROCm. ### Testing CI label ciflow/trunk covers the newly enabled ROCm functionality as well as any CUDA regressions caused by these changes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82498 Approved by: https://github.com/jjsjann123, https://github.com/davidberard98	2022-08-30 21:50:39 +00:00
Xinya Zhang	ec99a8003a	[ROCM] Improvements of incremental hipification and build (#82190 ) ### Description Improve the incremental build process on ROCM by eliminating unnecessary file changes. ### Issue N/A ### Testing 1. Run `python tools/amd_build/build_amd.py --out-of-place-only` multiple times, and ensure File `third_party/gloo/cmake/Modules/Findrccl.cmake` does not contain patterns like `RCCL_LIBRARY_PATH_PATH` 2. Run `python tools/amd_build/build_amd.py; USE_ROCM=1 python3 setup.py develop` twice, and confirm the second run does not trigger the compiling of thousands of files. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82190 Approved by: https://github.com/jithunnair-amd, https://github.com/ezyang	2022-07-27 13:37:40 +00:00
Huy Do	347b036350	Apply ufmt linter to all py files under tools (#81285 ) With ufmt in place https://github.com/pytorch/pytorch/pull/81157, we can now use it to gradually format all files. I'm breaking this down into multiple smaller batches to avoid too many merge conflicts later on. This batch (as copied from the current BLACK linter config): * `tools/*/.py` Upcoming batchs: * `torchgen/*/.py` * `torch/package/*/.py` * `torch/onnx/*/.py` * `torch/_refs/*/.py` * `torch/_prims/*/.py` * `torch/_meta_registrations.py` * `torch/_decomp/*/.py` * `test/onnx/*/.py` Once they are all formatted, BLACK linter will be removed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81285 Approved by: https://github.com/suo	2022-07-13 07:59:22 +00:00
PyTorch MergeBot	ec4be38ba9	Revert "To add hipify_torch as a submodule in pytorch/third_party (#74704 )" This reverts commit `93b0fec39d`. Reverted https://github.com/pytorch/pytorch/pull/74704 on behalf of https://github.com/malfet due to broke torchvision	2022-06-21 23:54:00 +00:00
Bhavya Medishetty	93b0fec39d	To add hipify_torch as a submodule in pytorch/third_party (#74704 ) `hipify_torch` as a submodule in `pytorch/third_party` Pull Request resolved: https://github.com/pytorch/pytorch/pull/74704 Approved by: https://github.com/jeffdaily, https://github.com/malfet	2022-06-21 18:56:49 +00:00
Shintaro Iwasaki	20e4d6c4dc	[PyTorch][AMD] fix hipify_python (#76720 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76720 This PR fixes an issue in hipify_python introduced by https://github.com/pytorch/pytorch/pull/76141. https://github.com/pytorch/pytorch/pull/76141 made all the `includes` paths "absolute", but this was not done for `args.extra_include_dir`; `new_dir`, which is a relative path, is directly added to `includes`. This PR fixes it by passing the absolute path (`abs_new_dir`). Test Plan: CI Reviewed By: albanD Differential Revision: D36089556 fbshipit-source-id: 1607075a4cb13696c1b25923f56b08a8cb3c6578 (cherry picked from commit 2ca648728f01c03320015f90d33404e75f978206)	2022-05-03 22:59:10 +00:00
rraminen	7422ccea8b	Hipify fixes for a successful DeepSpeed build These commits are required to build DeepSpeed on ROCm without the hipify errors. `a41829d9ed` `663c718462` cc: @jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/76141 Approved by: https://github.com/jeffdaily, https://github.com/pruthvistony, https://github.com/albanD	2022-04-28 13:19:59 +00:00
dzdang	6e292f1a21	[quant][core][gpu][improvement] Integrated quantized cudnn max pool2d with existing quantized_max_pool2d (#76129 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76129 Previously, quantized_max_pool2d_cudnn was made available to the frontend through torch.ops.quantized.max_pool2d. We improve the integration by also making it available through torch.max_pool2d, which is made possible by registering quantized_max_pool2d_cudnn in native_functions.yaml under quantized_max_pool2d, which is called in max_pool2d. Ideally and ultimately, we will get rid of the quantized_max_pool2d registration in native_functions.yaml, and directly register quantized_max_pool2d and quantized_max_pool2d_cudnn under max_pool2d, but current support for quantized dispatch keys blocks us from doing so. Test Plan: ``` python test/run_tests.py ``` ``` python test/run_tests.py ``` Differential Revision: D35789078 D35789078 Reviewed By: jerryzh168 Pulled By: dzdang fbshipit-source-id: 5d8220255bfab663b4779b5d3c66dea9f79d8ee7 (cherry picked from commit c27164da29043f7dc9a4c27d24a93cd37162c23e)	2022-04-27 01:52:45 +00:00
Scott Wolchok	e816e17655	[PyTorch] Add native fast path for transformer encoder inference (#76333 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76333 The current PyTorch multi-head attention and transformer implementations are slow. This should speed them up for inference. ghstack-source-id: 154737857 (Note: this ignores all push blocking failures!) Test Plan: CI Reviewed By: cpuhrsch Differential Revision: D35239925 fbshipit-source-id: 5a7eb8ff79bc6afb4b7d45075ddb2a24a6e2df28	2022-04-26 12:58:03 -04:00
Jon Janzen	2387efd356	Revert "[PyTorch] Add native fast path for transformer encoder inference" This reverts commit `b369b89f23`. This has internal changes and should not have been landed via mergebot. Ref: https://github.com/pytorch/pytorch/pull/75809#issuecomment-1108717166	2022-04-25 11:40:02 -04:00
Scott Wolchok	b369b89f23	[PyTorch] Add native fast path for transformer encoder inference Pull Request resolved: https://github.com/pytorch/pytorch/pull/75809 The current PyTorch multi-head attention and transformer implementations are slow. This should speed them up for inference. Differential Revision: [D35239925](https://our.internmc.facebook.com/intern/diff/D35239925/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35239925/)! Approved by: https://github.com/ezyang	2022-04-25 06:11:36 +00:00
Edward Z. Yang	a11c1bbdd0	Run Black on all of tools/ Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/76089 Approved by: https://github.com/albanD	2022-04-20 17:29:41 +00:00
Scott Wolchok	97c993ca7a	[PyTorch] Add NestedTensor support functions for transformers Pull Request resolved: https://github.com/pytorch/pytorch/pull/75491 Here are the NestedTensor kernels we'll need for the improved transformer implementation. Differential Revision: [D35409275](https://our.internmc.facebook.com/intern/diff/D35409275/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35409275/)! Approved by: https://github.com/cpuhrsch	2022-04-14 16:30:23 +00:00
Xiaodong Wang	025cd69a86	[AMD] Fix some legacy hipify script (#70594 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70594 Pull Request resolved: https://github.com/facebookincubator/gloo/pull/315 Fix some out-dated hipify script: * python -> python3 (fb internal) * rocblas return code * gloo makefile for hip clang Test Plan: Sandcastle + OSS build Reviewed By: malfet, shintaro-iwasaki Differential Revision: D33402839 fbshipit-source-id: 5893039451bcf77bbbb1b88d2e46ae3e39caa154	2022-01-05 11:34:25 -08:00
Peter Bell	560cd88195	Kill THCUNN (#63429 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63429 Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D30441308 Pulled By: ngimel fbshipit-source-id: 3ae342a2f8d5c7f8827b637c4055c5d1b0a1be26	2021-08-23 12:07:16 -07:00
Sam Estep	737d920b21	Strictly type everything in .github and tools (#59117 ) Summary: This PR greatly simplifies `mypy-strict.ini` by strictly typing everything in `.github` and `tools`, rather than picking and choosing only specific files in those two dirs. It also removes `warn_unused_ignores` from `mypy-strict.ini`, for reasons described in https://github.com/pytorch/pytorch/pull/56402#issuecomment-822743795: basically, that setting makes life more difficult depending on what libraries you have installed locally vs in CI (e.g. `ruamel`). Pull Request resolved: https://github.com/pytorch/pytorch/pull/59117 Test Plan: ``` flake8 mypy --config mypy-strict.ini ``` Reviewed By: malfet Differential Revision: D28765386 Pulled By: samestep fbshipit-source-id: 3e744e301c7a464f8a2a2428fcdbad534e231f2e	2021-06-07 14:49:36 -07:00
Jeff Daily	ba694520e5	[ROCm] fix JIT codegen (#57400 ) Summary: Fixes upcoming changes that are part of ROCm 4.2 and affect PyTorch JIT. - ROCM_VERSION macro must be available to both device and host compilation passes. - Unifies some of CUDA and HIP differences in the code generated. - NAN / POS_INFINITY / NEG_INFINITY - Do not hipify `extern __shared__` -> `HIP_DYNAMIC_SHARED()` macro [deprecated] - Differentiates bf16 codegen for HIP. - Optionally provides missing macros when using hiprtc precompiled header feature. Pull Request resolved: https://github.com/pytorch/pytorch/pull/57400 Reviewed By: ejguan Differential Revision: D28421065 Pulled By: malfet fbshipit-source-id: 215f476773c61d8b0d9d148a4e5f5d016f863074	2021-05-27 11:45:07 -07:00
Sam Estep	2e26976ad3	Disallow versionless Python shebangs (#58275 ) Summary: Some machines don't have a versionless `python` on their PATH, which breaks these existing shebangs. I'm assuming that all the existing versionless `python` shebangs are meant to be `python3` and not `python2`; please let me know if my assumption was incorrect for any of these. Pull Request resolved: https://github.com/pytorch/pytorch/pull/58275 Test Plan: CI. Reviewed By: zhouzhuojie Differential Revision: D28428143 Pulled By: samestep fbshipit-source-id: 6562be3d12924db72a92a0207b060ef740f61ebf	2021-05-14 08:26:02 -07:00

1 2

84 Commits