Commit Graph

65 Commits

Author SHA1 Message Date
Yang Chen
78c9b2948a [aot_inductor] move CudaWrapperCodeGen into a separate file (#119870)
This reverts commit 3ab08946d5.

Differential Revision: [D53817852](https://our.internmc.facebook.com/intern/diff/D53817852)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119870
Approved by: https://github.com/khabinov
2024-02-16 08:10:51 +00:00
Xinya Zhang
e3ca7346ce Re-add initial Flash Attention support on ROCM (#115981)
Note about the updates:

This PR:
1. Skips more flash-attention-related unit tests on MI200.
2. Fixes additional ATen compilation errors after hipification.
3. Fixes the author "root" of a specific commit.
4. Includes the patch from Nikita in favor of block-level static initialization.

CAVEAT: This revised PR contains a commit that modifies the CI to force it to run on MI200 nodes. That specific commit must be reverted before merge.

Original PR (https://github.com/pytorch/pytorch/pull/114309) note:

This pull request adds initial Flash Attention support for the AMD/ROCm platform. It adds a specialized Triton repository/branch as a compile-time dependency for the Flash Attention math library on AMD/ROCm. This Triton submodule is not used at runtime and will not be shipped in the final PyTorch package. We plan to release this specialized Triton as a separate project.

Known limitations:

- Only supports MI200-series GPUs (i.e., `gcnArchName == gfx90a:sramecc+:xnack-`).
- Only supports power-of-two sequence lengths.
- No support for varlen APIs.
- Only supports head dimensions 16, 32, 64, and 128.
- Performance is still being optimized.
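
A minimal usage sketch that stays within these limits (an illustration under assumptions: it presumes a ROCm build containing this PR on an MI200; `sdp_kernel` is PyTorch's kernel-selection context manager of this era, not part of this diff):

```
import torch
import torch.nn.functional as F

# Illustrative: power-of-two sequence length (128) and head dim 64, per the
# limitations above.
q, k, v = (torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16)
           for _ in range(3))
# Request the flash kernel explicitly so a fallback does not mask a failure.
with torch.backends.cuda.sdp_kernel(enable_flash=True,
                                    enable_math=False,
                                    enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v)
```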

Fixes #112997

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115981
Approved by: https://github.com/malfet
2024-01-04 22:21:31 +00:00
Jeff Daily
e3aefe2970 Revert "Initial Flash Attention support on ROCM (#114309)" (#115975)
This reverts commit 5bddbed399.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115975
Approved by: https://github.com/atalman, https://github.com/malfet
2023-12-16 03:40:14 +00:00
Xinya Zhang
5bddbed399 Initial Flash Attention support on ROCM (#114309)
This pull request adds initial Flash Attention support for the AMD/ROCm platform. It adds a specialized Triton repository/branch as a compile-time dependency for the Flash Attention math library on AMD/ROCm. This Triton submodule is not used at runtime and will not be shipped in the final PyTorch package. We plan to release this specialized Triton as a separate project.

Known limitations:

- [ ] Only supports MI200-series GPUs (i.e., `gcnArchName == gfx90a:sramecc+:xnack-`).
- [ ] Only supports power-of-two sequence lengths.
- [ ] No support for varlen APIs.
- [ ] Only supports head dimensions 16, 32, 64, and 128.
- [ ] Performance is still being optimized.

Fixes https://github.com/pytorch/pytorch/issues/112997

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114309

Approved by: https://github.com/jeffdaily, https://github.com/malfet

---------

Co-authored-by: Joseph Groenenboom <joseph.groenenboom@amd.com>
2023-12-14 08:52:57 -08:00
Jack Taylor
4a4c9fb0b8 [ROCm] Add ROCm AMDGPU support for inductor cpp codegen (#105141)
Follows from previous enablement attempt: https://github.com/pytorch/pytorch/pull/101797

Adds support for hsaco binaries in inductor's cpp_wrapper codegen and enables the CUDA tests in test_cpp_wrapper.

This PR also brings in additional required hipify mappings for the wrapper codegen file.

NOTE: we can unskip some of these tests once MI210 runners are enabled.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105141
Approved by: https://github.com/jansel, https://github.com/malfet
2023-11-29 15:11:24 +00:00
Jeff Daily
28c0b07d19 [ROCm] remove HCC references (#111975)
- rename `__HIP_PLATFORM_HCC__` to `__HIP_PLATFORM_AMD__`
- rename `HIP_HCC_FLAGS` to `HIP_CLANG_FLAGS`
- rename `PYTORCH_HIP_HCC_LIBRARIES` to `PYTORCH_HIP_LIBRARIES`
- workaround in tools/amd_build/build_amd.py until submodules are updated

These symbols have had a long deprecation cycle and will finally be removed in ROCm 6.0.
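
A hypothetical sketch of the substitution involved (hipify performs Python-based textual source-to-source mappings; the real tables in tools/amd_build differ):

```
# Illustrative only: the renames listed above, expressed as a substitution table.
RENAMES = {
    "__HIP_PLATFORM_HCC__": "__HIP_PLATFORM_AMD__",
    "HIP_HCC_FLAGS": "HIP_CLANG_FLAGS",
    "PYTORCH_HIP_HCC_LIBRARIES": "PYTORCH_HIP_LIBRARIES",
}

def apply_renames(source: str) -> str:
    for old, new in RENAMES.items():
        source = source.replace(old, new)
    return source
```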

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111975
Approved by: https://github.com/ezyang, https://github.com/hongxiayang
2023-10-26 02:39:10 +00:00
wangxiyuan
5589b81173 Remove redundant change for gloo (#106750)
The deprecated HIP symbols were removed by d74270ece2 and fe2ad9c328, which are already included in the gloo used by PyTorch.

gloo in pytorch master: 597accfd79

There is no need to fix this in PyTorch now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106750
Approved by: https://github.com/jithunnair-amd, https://github.com/kit1980
2023-09-26 03:46:14 +00:00
PyTorch MergeBot
5a7c008b30 Revert "[ROCm] Add ROCm AMDGPU support for inductor cpp codegen (#105141)"
This reverts commit 8ff00360a4.

Reverted https://github.com/pytorch/pytorch/pull/105141 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/105141#issuecomment-1715629007))
2023-09-12 12:29:55 +00:00
Jack Taylor
8ff00360a4 [ROCm] Add ROCm AMDGPU support for inductor cpp codegen (#105141)
Follows from previous enablement attempt: https://github.com/pytorch/pytorch/pull/101797

Adds support for hsaco binaries in inductor's cpp_wrapper codegen and enables the CUDA tests in test_cpp_wrapper.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105141
Approved by: https://github.com/jansel
2023-09-09 16:28:56 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow-up to the pyupgrade series, converting more strings to f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`
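
For reference, the kind of rewrite flynt performs (an illustrative example, not taken from this diff):

```
# Before: printf-style formatting
msg = "step %d of %d" % (step, total)
# After: the equivalent f-string, as produced by flynt
msg = f"step {step} of {total}"
```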

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Justin Chu
14d87bb5ff [BE] Enable ruff's UP rules and autoformat tools and scripts (#105428)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105428
Approved by: https://github.com/albanD, https://github.com/soulitzer, https://github.com/malfet
2023-07-19 01:24:44 +00:00
BowenBao
60a68477a6 Bump black version to 23.1.0 (#96578)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578
Approved by: https://github.com/ezyang
2023-03-15 06:27:59 +00:00
jjsjann123
c11b301bcd [NVFUSER] refactor nvfuser build (#89621)
This PR is the first step towards refactoring the nvfuser build so that the codegen becomes a standalone library.

Contents of this PR:
1. The nvfuser code base has been moved from `./torch/csrc/jit/codegen/cuda/` to `./nvfuser`, except for the registration code for integration (interface.h/interface.cpp).
2. The build system is split so that nvfuser generates its own `.so` files. Currently these are:
    - `libnvfuser_codegen.so`, which contains the integration, codegen, and runtime system of nvfuser
    - `nvfuser.so`, which is nvfuser's Python API via pybind. The Python frontend is now exposed via `nvfuser._C.XXX` instead of `torch._C._nvfuser`
3. nvfuser's C++ tests are now compiled into `nvfuser_tests`.
4. cmake is refactored so that:
    - nvfuser now has its own `CMakeLists.txt`, which is under `torch/csrc/jit/codegen/cuda/`.
    - nvfuser backend code is no longer compiled inside `libtorch_cuda_xxx`
    - nvfuser is added as a subdirectory under `./CMakeLists.txt` at the very end, after torch is built.
    - Since nvfuser depends on torch, the registration of nvfuser at runtime is done via dlopen (`at::DynamicLibrary`). This avoids a circular dependency in cmake, which would be a nightmare to handle. For details, look at `torch/csrc/jit/codegen/cuda/interface.cpp::LoadingNvfuserLibrary`. A ctypes analogue is sketched below.
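
A minimal sketch, in Python/ctypes terms, of the dlopen-based registration pattern described in item 4 (the actual implementation is the C++ `at::DynamicLibrary` helper; this analogue is illustrative only):

```
import ctypes

# Loading the shared object runs its static initializers, which register
# the fuser with torch at runtime -- no link-time dependency is needed,
# so the cmake dependency graph stays acyclic.
lib = ctypes.CDLL("libnvfuser_codegen.so", mode=ctypes.RTLD_GLOBAL)
```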

Future work scoped for follow-up PRs:
- Currently nvfuser codegen depends on torch; we need to refactor that out so we can move nvfuser into a submodule and not rely on dlopen to load the library. @malfet
- Since we moved nvfuser into a cmake build, we effectively disabled the bazel build for nvfuser. This could impact internal workloads at Meta, so we need to add that support back. cc'ing @vors

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89621
Approved by: https://github.com/davidberard98
2023-01-26 02:50:44 +00:00
Jeff Daily
d09486ab23 [ROCm] enable nvfuser (#82498)
### Description
nvfuser is now enabled for ROCm.

### Testing
CI label ciflow/trunk covers the newly enabled ROCm functionality as well as any CUDA regressions caused by these changes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82498
Approved by: https://github.com/jjsjann123, https://github.com/davidberard98
2022-08-30 21:50:39 +00:00
Xinya Zhang
ec99a8003a [ROCM] Improvements of incremental hipification and build (#82190)
### Description
Improve the incremental build process on ROCM by eliminating unnecessary file changes.
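
A hypothetical sketch of the core idea (names are illustrative, not the PR's actual code): rewrite a hipified output only when its content changed, so file timestamps are preserved and the build system skips recompilation.

```
import os

def write_if_changed(path: str, content: str) -> None:
    # Skip the write when the on-disk content already matches, preserving
    # the mtime so dependent targets are not rebuilt.
    if os.path.exists(path):
        with open(path) as f:
            if f.read() == content:
                return
    with open(path, "w") as f:
        f.write(content)
```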

### Issue
N/A

### Testing
1. Run `python tools/amd_build/build_amd.py --out-of-place-only` multiple times, and ensure the file `third_party/gloo/cmake/Modules/Findrccl.cmake` does not contain patterns like `RCCL_LIBRARY_PATH_PATH`.
2. Run `python tools/amd_build/build_amd.py; USE_ROCM=1 python3 setup.py develop` twice, and confirm that the second run does not trigger recompilation of thousands of files.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82190
Approved by: https://github.com/jithunnair-amd, https://github.com/ezyang
2022-07-27 13:37:40 +00:00
Huy Do
347b036350 Apply ufmt linter to all py files under tools (#81285)
With ufmt in place https://github.com/pytorch/pytorch/pull/81157, we can now use it to gradually format all files. I'm breaking this down into multiple smaller batches to avoid too many merge conflicts later on.

This batch (as copied from the current BLACK linter config):
* `tools/**/*.py`

Upcoming batches:
* `torchgen/**/*.py`
* `torch/package/**/*.py`
* `torch/onnx/**/*.py`
* `torch/_refs/**/*.py`
* `torch/_prims/**/*.py`
* `torch/_meta_registrations.py`
* `torch/_decomp/**/*.py`
* `test/onnx/**/*.py`

Once they are all formatted, the BLACK linter will be removed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81285
Approved by: https://github.com/suo
2022-07-13 07:59:22 +00:00
PyTorch MergeBot
ec4be38ba9 Revert "To add hipify_torch as a submodule in pytorch/third_party (#74704)"
This reverts commit 93b0fec39d.

Reverted https://github.com/pytorch/pytorch/pull/74704 on behalf of https://github.com/malfet due to broke torchvision
2022-06-21 23:54:00 +00:00
Bhavya Medishetty
93b0fec39d To add hipify_torch as a submodule in pytorch/third_party (#74704)
Adds `hipify_torch` as a submodule in `pytorch/third_party`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74704
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2022-06-21 18:56:49 +00:00
Shintaro Iwasaki
20e4d6c4dc [PyTorch][AMD] fix hipify_python (#76720)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76720

This PR fixes an issue in hipify_python introduced by https://github.com/pytorch/pytorch/pull/76141.

https://github.com/pytorch/pytorch/pull/76141 made all the `includes` paths absolute, but this was not done for `args.extra_include_dir`: `new_dir`, which is a relative path, was added directly to `includes`. This PR fixes that by passing the absolute path (`abs_new_dir`).
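
A minimal sketch of the fix as described (a hedged illustration; the names follow the summary above, not necessarily hipify_python.py itself):

```
import os
from typing import List

def add_extra_includes(extra_include_dirs: List[str], includes: List[str]) -> None:
    for new_dir in extra_include_dirs:
        # Previously the relative new_dir was appended directly, breaking the
        # invariant that every `includes` entry is absolute.
        abs_new_dir = os.path.abspath(new_dir)
        includes.append(abs_new_dir)
```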

Test Plan: CI

Reviewed By: albanD

Differential Revision: D36089556

fbshipit-source-id: 1607075a4cb13696c1b25923f56b08a8cb3c6578
(cherry picked from commit 2ca648728f01c03320015f90d33404e75f978206)
2022-05-03 22:59:10 +00:00
rraminen
7422ccea8b Hipify fixes for a successful DeepSpeed build
These commits are required to build DeepSpeed on ROCm without hipify errors.

a41829d9ed
663c718462

cc: @jeffdaily

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76141
Approved by: https://github.com/jeffdaily, https://github.com/pruthvistony, https://github.com/albanD
2022-04-28 13:19:59 +00:00
dzdang
6e292f1a21 [quant][core][gpu][improvement] Integrated quantized cudnn max pool2d with existing quantized_max_pool2d (#76129)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76129

Previously, quantized_max_pool2d_cudnn was made available to the frontend through torch.ops.quantized.max_pool2d. We improve the integration by also making it available through torch.max_pool2d. This is made possible by registering quantized_max_pool2d_cudnn in native_functions.yaml under quantized_max_pool2d, which is called by max_pool2d.
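
A hedged usage sketch of the improved integration (assuming a CUDA build with the cudnn quantized backend available; values are illustrative):

```
import torch

x = torch.randn(1, 3, 8, 8, device="cuda")
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.qint8)
# With the registration described above, the quantized input dispatches to
# quantized_max_pool2d_cudnn via the ordinary max_pool2d entry point.
out = torch.max_pool2d(qx, kernel_size=2)
```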

Ideally and ultimately, we will get rid of the quantized_max_pool2d registration in native_functions.yaml and directly register quantized_max_pool2d and quantized_max_pool2d_cudnn under max_pool2d, but current support for quantized dispatch keys blocks us from doing so.

Test Plan:
```
python test/run_tests.py
```

Differential Revision: D35789078

Reviewed By: jerryzh168

Pulled By: dzdang

fbshipit-source-id: 5d8220255bfab663b4779b5d3c66dea9f79d8ee7
(cherry picked from commit c27164da29043f7dc9a4c27d24a93cd37162c23e)
2022-04-27 01:52:45 +00:00
Scott Wolchok
e816e17655 [PyTorch] Add native fast path for transformer encoder inference (#76333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76333

The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.
ghstack-source-id: 154737857
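
A hedged usage sketch: the fast path targets inference, i.e. modules in eval mode under no_grad (illustrative; the exact eligibility conditions live in the diff):

```
import torch

layer = torch.nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
layer.eval()  # the native fast path applies to inference only
with torch.no_grad():
    out = layer(torch.randn(4, 32, 256))  # (batch, seq, feature)
```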

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: cpuhrsch

Differential Revision: D35239925

fbshipit-source-id: 5a7eb8ff79bc6afb4b7d45075ddb2a24a6e2df28
2022-04-26 12:58:03 -04:00
Jon Janzen
2387efd356 Revert "[PyTorch] Add native fast path for transformer encoder inference"
This reverts commit b369b89f23.

This has internal changes and should not have been landed via mergebot.

Ref: https://github.com/pytorch/pytorch/pull/75809#issuecomment-1108717166
2022-04-25 11:40:02 -04:00
Scott Wolchok
b369b89f23 [PyTorch] Add native fast path for transformer encoder inference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75809

The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.

Differential Revision: [D35239925](https://our.internmc.facebook.com/intern/diff/D35239925/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35239925/)!

Approved by: https://github.com/ezyang
2022-04-25 06:11:36 +00:00
Edward Z. Yang
a11c1bbdd0 Run Black on all of tools/
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76089

Approved by: https://github.com/albanD
2022-04-20 17:29:41 +00:00
Scott Wolchok
97c993ca7a [PyTorch] Add NestedTensor support functions for transformers
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75491

Here are the NestedTensor kernels we'll need for the improved transformer implementation.

Differential Revision: [D35409275](https://our.internmc.facebook.com/intern/diff/D35409275/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35409275/)!

Approved by: https://github.com/cpuhrsch
2022-04-14 16:30:23 +00:00
Xiaodong Wang
025cd69a86 [AMD] Fix some legacy hipify script (#70594)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70594

Pull Request resolved: https://github.com/facebookincubator/gloo/pull/315

Fixes some outdated hipify script issues:
* python -> python3 (fb internal)
* rocblas return code
* gloo makefile for hip-clang

Test Plan: Sandcastle + OSS build

Reviewed By: malfet, shintaro-iwasaki

Differential Revision: D33402839

fbshipit-source-id: 5893039451bcf77bbbb1b88d2e46ae3e39caa154
2022-01-05 11:34:25 -08:00
Peter Bell
560cd88195 Kill THCUNN (#63429)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63429

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D30441308

Pulled By: ngimel

fbshipit-source-id: 3ae342a2f8d5c7f8827b637c4055c5d1b0a1be26
2021-08-23 12:07:16 -07:00
Sam Estep
737d920b21 Strictly type everything in .github and tools (#59117)
Summary:
This PR greatly simplifies `mypy-strict.ini` by strictly typing everything in `.github` and `tools`, rather than picking and choosing only specific files in those two dirs. It also removes `warn_unused_ignores` from `mypy-strict.ini`, for reasons described in https://github.com/pytorch/pytorch/pull/56402#issuecomment-822743795: basically, that setting makes life more difficult depending on what libraries you have installed locally vs in CI (e.g. `ruamel`).
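
For illustration, strict typing means even small helpers in these directories carry full annotations (a hypothetical example, not from the diff):

```
from typing import List

# Under mypy --strict, untyped defs are errors, so explicit parameter and
# return annotations are required.
def read_lines(path: str) -> List[str]:
    with open(path) as f:
        return f.read().splitlines()
```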

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59117

Test Plan:
```
flake8
mypy --config mypy-strict.ini
```

Reviewed By: malfet

Differential Revision: D28765386

Pulled By: samestep

fbshipit-source-id: 3e744e301c7a464f8a2a2428fcdbad534e231f2e
2021-06-07 14:49:36 -07:00
Jeff Daily
ba694520e5 [ROCm] fix JIT codegen (#57400)
Summary:
Addresses upcoming changes in ROCm 4.2 that affect the PyTorch JIT.

- ROCM_VERSION macro must be available to both device and host compilation passes.
- Unifies some of CUDA and HIP differences in the code generated.
  - NAN / POS_INFINITY / NEG_INFINITY
  - Do not hipify `extern __shared__` -> `HIP_DYNAMIC_SHARED()` macro [deprecated]
- Differentiates bf16 codegen for HIP.
- Optionally provides missing macros when using hiprtc precompiled header feature.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57400

Reviewed By: ejguan

Differential Revision: D28421065

Pulled By: malfet

fbshipit-source-id: 215f476773c61d8b0d9d148a4e5f5d016f863074
2021-05-27 11:45:07 -07:00
Sam Estep
2e26976ad3 Disallow versionless Python shebangs (#58275)
Summary:
Some machines don't have a versionless `python` on their PATH, which breaks these existing shebangs.

I'm assuming that all the existing versionless `python` shebangs are meant to be `python3` and not `python2`; please let me know if my assumption was incorrect for any of these.
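
The per-file change is just the shebang line (illustrative):

```
#!/usr/bin/env python3
# Previously: "#!/usr/bin/env python", which fails on machines that only
# install `python3`.
```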

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58275

Test Plan: CI.

Reviewed By: zhouzhuojie

Differential Revision: D28428143

Pulled By: samestep

fbshipit-source-id: 6562be3d12924db72a92a0207b060ef740f61ebf
2021-05-14 08:26:02 -07:00
Jeff Daily
b2e5617553 [ROCm] rename HIP_HCC_FLAGS to HIP_CLANG_FLAGS (#50917)
Summary:
ROCm 3.5 replaced hcc with hip-clang and deprecated HIP_HCC_FLAGS.
HIP_CLANG_FLAGS should be used moving forward. HIP_HCC_FLAGS will
be removed soon.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50917

Reviewed By: ejguan

Differential Revision: D26008094

Pulled By: walterddr

fbshipit-source-id: cfec4f96fbd9bd338834a841c37267f6a4703cab
2021-01-22 07:24:05 -08:00
Jithun Nair
45ec35827e Set USE_RCCL cmake option (dependent on USE_NCCL) [REDUX] (#34683)
Summary:
Refiled duplicate of https://github.com/pytorch/pytorch/issues/31341, which was reverted in commit 63964175b5.

This PR enables RCCL support when building Gloo as part of PyTorch for ROCm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/34683

Reviewed By: glaringlee

Differential Revision: D25540578

Pulled By: ezyang

fbshipit-source-id: fcb02e5745d62e1b7d2e02048160e9e7a4b4df2d
2021-01-06 07:03:02 -08:00
Bugra Akyildiz
27c7158166 Remove __future__ imports for legacy Python2 supports (#45033)
Summary:
The `2to3` tool has a `future` fixer that you can target specifically to remove these; the `caffe2` directory has the most redundant imports:

```2to3 -f future -w caffe2```
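
The fixer deletes Python-2 compatibility imports such as the header common in caffe2 files (illustrative):

```
# Before: `2to3 -f future` removes this line outright; nothing replaces it.
from __future__ import absolute_import, division, print_function, unicode_literals
```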

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033

Reviewed By: seemethere

Differential Revision: D23808648

Pulled By: bugra

fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
2020-09-23 17:57:02 -07:00
Jeff Daily
5152633258 [ROCm] update hip library name (#41813)
Summary:
With the transition to hip-clang, the HIP runtime library name changed. A symlink was added to ease the transition, but it is going to be removed. Conditionally set the library name based on the HIP compiler used. Patch the gloo submodule as part of the build_amd.py script until its associated fix is available.

CC ezyang xw285cornell sunway513

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41813

Reviewed By: zhangguanheng66

Differential Revision: D22660077

Pulled By: xw285cornell

fbshipit-source-id: c538129268d9947535b34523201f655b13c9e0a3
2020-07-22 09:42:45 -07:00
Edward Yang
b4aceb3884 Fix lint (#39527)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39527

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D21884798

Pulled By: ezyang

fbshipit-source-id: a130bfd4cc122ea1d45e7db7303bf44e04f08703
2020-06-04 10:30:44 -07:00
Jithun Nair
af91df68ed Remove cuda init patch (#39222)
Summary:
The lines below have already been removed from `torch/cuda/__init__.py` anyway:
```
        _cudart = _load_cudart()
        _cudart.cudaGetErrorName.restype = ctypes.c_char_p
        _cudart.cudaGetErrorString.restype = ctypes.c_char_p
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39222

Differential Revision: D21864397

Pulled By: yns88

fbshipit-source-id: 941b13f92192f930e1dfa4b385e1aec2e321e75f
2020-06-04 09:31:34 -07:00
Xiaodong Wang
36b73d5a1b Hipify contrib/nccl (#29385)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29385

hipify contrib/gloo

Test Plan: OSS & sandcastle build

Reviewed By: bddppq

Differential Revision: D18373308

fbshipit-source-id: 39c232db36318af116c341f64d03642639575ecd
2019-11-08 10:39:17 -08:00
Hong Xu
987e37b9c2 Enable EXE001 flake8 check. (#27560)
Summary:
According to https://github.com/pytorch/pytorch/issues/27285, it seems we do not intend to use shebangs as an indication of the Python version, so we enable the EXE001 flake8 check.
For violations, we either remove the shebang from non-executable Python scripts or grant them executable permission.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27560

Differential Revision: D17831782

Pulled By: ezyang

fbshipit-source-id: 6282fd3617b25676a6d959af0d318faf05c09b26
2019-10-09 09:15:29 -07:00
Your Name
4bd8ae13c6 Move hipify to torch/utils to bundle them into torch package (#27425)
Summary:
Similar to https://github.com/pytorch/pytorch/pull/27418, but tries to put it under the "torch" namespace.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27425

Differential Revision: D17779490

Pulled By: bddppq

fbshipit-source-id: 688338d143509b37dfc110df17af3331db48a42b
2019-10-07 17:25:45 -07:00
Junjie Bai
3c2cd8cc10 Some hipify script cleanups (#27375)
Summary:
Continues https://github.com/pytorch/pytorch/issues/26363.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27375

Differential Revision: D17764992

Pulled By: bddppq

fbshipit-source-id: ecc06521179677efcedb1d58ceda63df7d63627e
2019-10-04 14:43:22 -07:00
Johannes M Dieterich
fc36842554 Improve hip-clang support in build_amd.py (#23835)
Summary:
Use the supported way to differentiate and automatically switch between hip-clang and hcc hipification in build_amd.py.

Cleaned up from PR https://github.com/pytorch/pytorch/issues/23699
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23835

Differential Revision: D16659661

Pulled By: vincentqb

fbshipit-source-id: 05a4250ceb28beda7a7bf73a46c5dc46f6e852bc
2019-08-07 07:49:07 -07:00
Edward Yang
4050de5b58 Revert D16627326: [pytorch][PR] [ROCm] Improve hip-clang support in build_amd.py
Differential Revision:
D16627326

Original commit changeset: 977003174395

fbshipit-source-id: d26959c85d74ce8b81341a31c9ddb2260bf18c9b
2019-08-05 15:04:47 -07:00
Yaxun (Sam) Liu
f0a581801a Improve hip-clang support in build_amd.py (#23699)
Summary:
Use the supported way to differentiate and automatically switch between hip-clang and hcc hipification in build_amd.py.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23699

Differential Revision: D16627326

Pulled By: vincentqb

fbshipit-source-id: 977003174395fb69cf0c96c89232bd6214780cd8
2019-08-05 13:39:28 -07:00
Jerry Zhang
f7de9be3c0 Add FakeQuantize Module (#21767)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21767

Adds the FakeQuantize module for quantization-aware training.
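
A hedged usage sketch (module path as of this era; the default-argument behavior is an assumption):

```
import torch
from torch.quantization import FakeQuantize

# Simulates quantize/dequantize in the forward pass while keeping float
# dtypes, so training can model quantization error.
fq = FakeQuantize()  # assumed defaults: 8-bit range with a moving-average observer
y = fq(torch.randn(4, 4))
```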

Reviewed By: dzhulgakov

Differential Revision: D15728503

fbshipit-source-id: 2a9a6a362812ede3deac42b93dddca35987bd8e6
2019-07-15 14:08:55 -07:00
Xiaodong Wang
76713fb564 Fix remote build + clean up disable feature hack (#21816)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21816

Clean up disable feature hack.

Reviewed By: bddppq

Differential Revision: D15833285

fbshipit-source-id: a2ae5d0f15e47b835dbd3997bbaa0add7e868f20
2019-06-17 08:08:34 -07:00
Xiaodong Wang
f3d827f311 Hipify fb/quantize
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20725

Reviewed By: bddppq

Differential Revision: D15407710

fbshipit-source-id: e5fdeee7e2dffd43cfdd6fab6193eb8a80902c02
2019-05-21 10:51:36 -07:00
Junjie Bai
bc5398451e Enable ROCm multi-gpu with Gloo
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18640

Differential Revision: D15185822

Pulled By: bddppq

fbshipit-source-id: 1b49ab3fb0f251cfc7ef3ddd62033ae0065a4ec3
2019-05-07 09:55:47 -07:00
Xiaodong Wang
9d0b5a1ce9 Build caffe2/fb/operators (#19688)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19688

Minor changes to the hipify script to accept extra folders.

Reviewed By: bddppq

Differential Revision: D15068427

fbshipit-source-id: e2e792c8227cbd0e15fd2564f87d740a62c477da
2019-04-29 09:01:10 -07:00
Edward Yang
173f224570 Turn on F401: Unused import warning. (#18598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**

This was requested by someone at Facebook; this lint is turned
on for Facebook by default.  "Sure, why not."

I had to noqa a number of imports in __init__.  Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it.  Left for future work.
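
E.g., the noqa pattern for a deliberate re-export (illustrative):

```
# In an __init__.py: keep the re-export without tripping F401.
from .tensor import Tensor  # noqa: F401
```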

Be careful!  flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments.  flake8-3 will
report an import unused; flake8-2 will not.  For now, I just
noqa'd all these sites.

All the changes were done by hand.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14687478

fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
2019-03-30 09:01:17 -07:00