pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Edward Yang	173f224570	Turn on F401: Unused import warning. (#18598 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598 ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a Stack from [ghstack](https://github.com/ezyang/ghstack): * #18598 Turn on F401: Unused import warning. This was requested by someone at Facebook; this lint is turned on for Facebook by default. "Sure, why not." I had to noqa a number of imports in __init__. Hypothetically we're supposed to use __all__ in this case, but I was too lazy to fix it. Left for future work. Be careful! flake8-2 and flake8-3 behave differently with respect to import resolution for # type: comments. flake8-3 will report an import unused; flake8-2 will not. For now, I just noqa'd all these sites. All the changes were done by hand. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: D14687478 fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3	2019-03-30 09:01:17 -07:00
Xiaodong Wang	62d8c8cf0a	Manual hipify caffe2/distributed and rocm update (no hcc modules support) (#18088 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18088 Manually hipify the distributed folder Reviewed By: bddppq Differential Revision: D14482702 fbshipit-source-id: cc0abdf525b423ab1f18db8010d21e27c6668d36	2019-03-29 11:07:32 -07:00
bddppq	1989716ae5	Resubmit PR-18512: Improved onnx export for 3 onnx ops (#18571 ) Summary: Fix ROCm CI failure Pull Request resolved: https://github.com/pytorch/pytorch/pull/18571 Differential Revision: D14669323 Pulled By: bddppq fbshipit-source-id: 022afe5c20e680295c9cfdfe1ec14650305955a8	2019-03-28 18:12:49 -07:00
Sandeep Kumar	6248266d91	Enable detectron on AMD GPU Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17862 Differential Revision: D14429234 Pulled By: bddppq fbshipit-source-id: 5cb8750bd9db0ff8a179977d2bfbb180265cce81	2019-03-12 16:29:42 -07:00
Johannes M Dieterich	288e1fbd18	Add '--hip-clang-launch' to favor <<<>>>-based launch. (#17686 ) Summary: hip-clang uses triple chevron kernel dispatch syntax. Add an option to the hipification script to skip translating triple chevron to hipLaunchKernelGGL. Once we switch to hip-clang, this option will be default and subsequently removed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17686 Differential Revision: D14327810 Pulled By: bddppq fbshipit-source-id: 5e1512325077dd3ebb8fb9b5bf35fd1f8d9a4dc3	2019-03-05 12:52:22 -08:00
Jithun Nair	06c8aa7a3b	Hipify fixes for Masquerade logic (#17598 ) Summary: ezyang Please review. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17598 Differential Revision: D14287724 Pulled By: ezyang fbshipit-source-id: 46e5083854a827370bb4c81b82e5a4ede511e473	2019-03-01 15:13:19 -08:00
Edward Yang	c9989dfe37	Make HIPStream also masquerade as CUDA. (#17469 ) Summary: HIPGuard interfaces that interacted with HIPStream were previously totally busted (because the streams had the wrong device type). This fixes it, following along the same lines of MasqueardingAsCUDA. Along the way I beefed up the explanatory comment. Signed-off-by: Edward Z. Yang <ezyang@fb.com> cc jithunnair-amd iotamudelta bddppq Pull Request resolved: https://github.com/pytorch/pytorch/pull/17469 Differential Revision: D14243396 Pulled By: ezyang fbshipit-source-id: 972455753a62f8584ba9ab194f9c785db7bb9bde	2019-02-28 13:46:11 -08:00
Zachary DeVito	356a94b64e	Lazily load libcuda libnvrtc from c++ (#17317 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/16860 Pull Request resolved: https://github.com/pytorch/pytorch/pull/17317 Differential Revision: D14157877 Pulled By: zdevito fbshipit-source-id: c37aec2d77c2e637d4fc6ceffe2bd32901c70317	2019-02-22 13:51:45 -08:00
Johannes M Dieterich	f84165d20d	Remove static_cast insertion/kernel argument extration. (#17055 ) Summary: In light of the antistatic feature being a part of the released ROCm 2.1, remove the feature in pyHIPIFY for extraction of kernel arguments and insertion of static_casts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17055 Differential Revision: D14068478 Pulled By: bddppq fbshipit-source-id: 6895f490c78247a129aa18c520ff8d4d1a3d3642	2019-02-15 01:54:31 -08:00
Xiaodong Wang	6f2bcc9b4f	Caffe2 TARGETS for HIP (#17076 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17076 OSS: slightly change the tools/amd_build/build_amd.py to add the output_directory for internal use. Also modify the renaming convention in hipify script to reflect the updated rules. Reviewed By: bddppq Differential Revision: D13767218 fbshipit-source-id: cbcadc51daab42197d545f204840dcc18176bb3d	2019-02-14 15:45:21 -08:00
Xiaomeng Yang	7d4a81cbb2	Use macro for reduce on 2d blocks (#16344 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16344 Use macro for reduce on 2d blocks i-am-not-moving-c2-to-c10 Reviewed By: houseroad Differential Revision: D13808988 fbshipit-source-id: b68c0fb6079c1b6e203a072083aba7a95c202bc2	2019-02-01 23:49:07 -08:00
Zachary DeVito	21193bf123	try to get rid of tmp_install (#16414 ) Summary: Rehash of previous attempts. This tries a different approach where we accept the install as specified in cmake (leaving bin/ include/ and lib/ alone), and then try to adjust the rest of the files to this more standard layout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16414 Differential Revision: D13863635 Pulled By: zdevito fbshipit-source-id: 23725f5c64d7509bf3ca8f472dcdcad074de9828	2019-01-29 17:29:40 -08:00
Xiaomeng Yang	0a2d14dd7c	Optimize SpatialBNOp on GPU (#16395 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16395 Optimize SpatialBNOp on GPU i-am-not-moving-c2-to-c10 Reviewed By: houseroad Differential Revision: D13829833 fbshipit-source-id: 04d2a63e8e9830c4c39a91cf87fcd7aa765dc55f	2019-01-28 09:36:45 -08:00
Edward Yang	e936a69085	Move THCCachingAllocator to c10_cuda. (#16119 ) Summary: Some renaming and renamespacing also took place. I was originally planning not to do anything, but it turns out that it was easier to make HIPify work by using a namespace CUDACachingAllocator:: rather than THCCachingAllocator_, since :: is a word boundary but _ is not. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16119 Reviewed By: smessmer Differential Revision: D13718768 fbshipit-source-id: 884a481d99027fd3e34471c020f826aa12225656	2019-01-24 12:06:56 -08:00
Mickaël Schoentgen	04f5605ba1	Fix several DeprecationWarning: invalid escape sequence (#15733 ) Summary: Hello, This is a little patch to fix `DeprecationWarning: invalid escape sequence`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15733 Differential Revision: D13587291 Pulled By: soumith fbshipit-source-id: ce68db2de92ca7eaa42f78ca5ae6fbc1d4d90e05	2019-01-05 08:53:35 -08:00
hbraun@nvidia.com	3fdf567752	Adding CUDA version for C2 operators generate proposals and nms (#13694 ) Summary: Related to issue #13684 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13694 Reviewed By: wat3rBro Differential Revision: D13017791 Pulled By: newstzpz fbshipit-source-id: 4bdc58e474d8e1f6cd73a02bf51f91542a2b9d0b	2018-12-20 14:39:09 -08:00
bddppq	34f1f2208b	Build c10 HIP test Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15233 Reviewed By: ezyang Differential Revision: D13471002 Pulled By: bddppq fbshipit-source-id: b42c3bc2b9db672ce50a52eb700cc6ed13d3535f	2018-12-14 15:36:38 -08:00
Johannes M Dieterich	b316e44a46	Remove __forceinline__ hipification step. (#15229 ) Summary: The HIP definition now correctly contains the inline attribute. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15229 Differential Revision: D13470962 Pulled By: bddppq fbshipit-source-id: 34f8361bda5f3dce20a2eeb530c3a25d1b1bdd06	2018-12-14 14:24:05 -08:00
bddppq	de0784510d	Remove disabled_features in hipify Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15098 Reviewed By: ezyang Differential Revision: D13453762 Pulled By: bddppq fbshipit-source-id: e177042c78f5bf393163d660c25b80285353853d	2018-12-13 15:43:57 -08:00
Edward Yang	2d485ffb17	Move CUDAGuard, CUDAStream and CUDAGuardImpl to c10/cuda (#14248 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14248 This diff also introduces a horrifying hack to override CUDA's DeviceGuardImpl with a HIPGuardImplMasqueradingAsCUDA, to accommodate PyTorch's current behavior of pretending CUDA is HIP when you build with ROCm enabled. Reviewed By: bddppq Differential Revision: D13145293 fbshipit-source-id: ee0e207b6fd132f0d435512957424a002d588f02	2018-12-12 11:24:26 -08:00
bddppq	479481b6cb	Remove linker and dlopen flags that allowed undefined symbols in rocm build (#15091 ) Summary: Previously the undefined symbols were caused by disabled_modules in tools/amd_build/disabled_features.json (now it's cleared). Pull Request resolved: https://github.com/pytorch/pytorch/pull/15091 Differential Revision: D13429595 Pulled By: bddppq fbshipit-source-id: b341e83f9e5a8d16440a364e837b045a8a4fd6e1	2018-12-11 23:23:47 -08:00
Edward Yang	b710642969	Make ATen HIPify out-of-place, but still reuse CUDA names. (#14866 ) Summary: ``` This diff changes the HIPification of ATen to be out-of-place. We now have the following mappings: - ATen/cuda => ATen/hip - ATen/native/cuda => ATen/native/hip - ATen/native/sparse/cuda => ATen/native/sparse/hip - THC => THH - THCUNN => THHUNN The build system is adjusted to know about these new build paths, and HIPify is taught how to adjust include paths and THC_GENERIC_FILE appropriately. ATen_hip is now built as the ATen_hip library, rather than reusing ATen_cuda. However, despite these new filepaths, none of the identifiers in ATen have actually changed. So, e.g., THHGeneral.h still defines functions named THC_blahblah, and HIP still shows up as CUDA in PyTorch itself. We'll tackle this in a subsequent PR; this diff is just to get the files out-of-place. Minor extra improvements: - Don't edit tmp_install when hipifying - HIP no longer builds native_cudnn_cpp; it was unnecessary - Caffe2_HIP_INCLUDES is now Caffe2_HIP_INCLUDE, for consistency with all the other variables. - HIP build now properly respects ATEN_CUDA_FILES_GEN_LIB (it did not previously.) - You can now override file extension matching in pyHIPIFY by explicitly specifying its full name in the matching list. This is used so we can HIPify CMakeLists.txt in some situations. A little bit of string and ceiling wax: - gen.py grows a --rocm flag so that it knows to generate CUDA files which actually refer to the HIP headers (e.g., THH.h) We'll get rid of this eventually and generate real HIP files, but not for this PR. - Management of HIP dependencies is now completely deleted from the ATen CMakeLists.txt. The old code was dead (because it was shoveled in ATen_CUDA_DEPENDENCY_LIBS and promptly ignored by the Caffe2 build system) and didn't actually work. 
``` Stacked on https://github.com/pytorch/pytorch/pull/14849 review last commit only Pull Request resolved: https://github.com/pytorch/pytorch/pull/14866 Differential Revision: D13419475 Pulled By: ezyang fbshipit-source-id: cb4c843df69a1d8369314c9fab1b7719520fa3db	2018-12-11 19:15:27 -08:00
rohithkrn	7e2b074219	Integrate rocBLAS fp16 api into Caffe2 (#14882 ) Summary: This PR integrates rocBLAS half and mixed precision APIs in to Caffe2. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14882 Differential Revision: D13407840 Pulled By: bddppq fbshipit-source-id: 75cb0d74da066776fa66575f1d255e879d36121e	2018-12-10 17:54:06 -08:00
Edward Yang	23cc3daabd	Disable getNumGPUs rewrite (#14993 ) Summary: cc iotamudelta Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/14993 Differential Revision: D13405804 Pulled By: ezyang fbshipit-source-id: c4aa9ed29ee2a4f3abf76c1e0fa8babfd738db35	2018-12-10 15:13:55 -08:00
Edward Yang	66315ab323	Stop disabling maybeOverlappingIndices (#14999 ) Summary: Signed-off-by: Edward Z. Yang <ezyang@fb.com> cc iotamudelta Pull Request resolved: https://github.com/pytorch/pytorch/pull/14999 Differential Revision: D13405754 Pulled By: ezyang fbshipit-source-id: 98459496494390ad1115b4f1f6738d53c14f0745	2018-12-10 15:02:08 -08:00
Junjie Bai	6651fae827	Make autograd engine compatible with hip Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14873 Differential Revision: D13375053 Pulled By: bddppq fbshipit-source-id: f3051640386667bbf0566856ed433eb83276c39e	2018-12-07 00:12:06 -08:00
Junjie Bai	f82f4de229	Stop inserting static casts in Hipify (#14853 ) Summary: Latest hcc can now properly cast to correct type internally, so there is no need to insert static_cast in hipify scripts anymore. However the hcc included in the latest ROCm release (1.9.2) doesn't have this fix, so leaving a flag to continue doing static_cast for those using the official ROCm releases. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14853 Differential Revision: D13363171 Pulled By: bddppq fbshipit-source-id: a36476a8511222ff3c933d31788e8a0ffb04f5ca	2018-12-06 13:19:33 -08:00
Edward Yang	f9446e0c94	HIPify less files in PyTorch (#14804 ) Summary: Stacked on #14803 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14804 Differential Revision: D13347986 Pulled By: ezyang fbshipit-source-id: c93177b4ad51855660d0de36d042bfc542bd4be0	2018-12-05 20:52:38 -08:00
Edward Yang	999690ff3d	Improve HIPify performance (#14803 ) Summary: ``` Improve performance of pyHIPIFY Changes: - Pre-compile regexes, don't use regexes when it's not necessary (this saves us ~15%) - Compile all substitutions for mappings into a single, non-backtracking regex using a Trie. This gives big savings. Before, running pyHIPIFY on all files took 15.8s. Now it takes 3.9s. ``` Stacked on #14769 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14803 Differential Revision: D13342620 Pulled By: ezyang fbshipit-source-id: 1cfa36b3236bbe24d07080a31cc788a52d740f40	2018-12-05 11:00:03 -08:00
Edward Yang	62f4db6d8a	Unify build_caffe2_amd.py and build_pytorch_amd.py (#14769 ) Summary: I need to preserve ability to HIPify out-of-place files only, so build_amd.py grows a --out-of-place-only flag. Stacked on #14757 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14769 Differential Revision: D13340154 Pulled By: ezyang fbshipit-source-id: 1b855bc79e824ea94517a893236fd2c8ba4cb79d	2018-12-05 09:26:12 -08:00
Edward Yang	e829a52977	Remove use of hipify_caffe2, in favor of file path test. (#14757 ) Summary: This is towards unifying build_pytorch_amd.py and build_caffe2_amd.py scripts. There is only one use of hipify_caffe2 left, which is just to control which files actually get HIPified. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/14757 Differential Revision: D13323486 Pulled By: ezyang fbshipit-source-id: 958cd91be32dfc3c0a9ba9eda507adb5937aebcd	2018-12-04 12:48:49 -08:00
Michael Antonov	773f4d8081	Implements Gather operator for arbitrary axis, sharing the code with BatchGather. (#13756 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13756 This implements general Gather operator for arbitrary axis, sharing the code with BatchGather. - CPU gather & batch gather logic is now shared through caffe2::gather_helper, for any axis. - Shared CUDA kernel moved to gather_op.cuh, for any axis. - Gradients of axis > 0 delegate to BatchGatherGradientOp which now has axis argument. - BatchGatherOp doc strings updated to have correct rank (q + (r -1)) and output. - Added tests for axis == 2. GatherOp supports index wrapping for axis == 0 by default, which was earlier for ONNX. This diff also extends it to work in Cuda kernel. Added "wrap_indices" argument which specifies whether this wrapping should be done; set it to true if you'd like wrapping for any axis. TBD: Update gradients to support negative indices (separate diff). TBD: Once we have operator versioning, we'd like to update GatherOp to NOT support axis 0 wrapping by default, but rather do it only if wrap_indices is set. Reviewed By: dzhulgakov Differential Revision: D12983815 fbshipit-source-id: 8add9d67b47fe8c5ba7a335f581ca0530b205cd7	2018-12-04 11:54:28 -08:00
Edward Yang	507cb16583	Delete OPENMP_STUB translation. (#14286 ) Summary: Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/14286 Differential Revision: D13205356 Pulled By: ezyang fbshipit-source-id: 08e9821e4b32f8d7f3c41906e481f280ee6cf2e3	2018-11-26 19:08:07 -08:00
Edward Yang	e58bbbac18	Delete dependencies from CUDAStream; remove synchronize_with (#13920 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13920 I want to move CUDAStream and CUDAGuard to c10_cuda without also bringing along CUDAContext or CUDAEvent for the ride (at least for now). To do this, I need to eliminate those dependencies. There's a few functions in CUDAContext.h which don't really need THCState, so they're separated out and put in general purpose c10/cuda/CUDAFunctions.h Reviewed By: smessmer Differential Revision: D13047468 fbshipit-source-id: 7ed9d5e660f95805ab39d7af25892327edae050e	2018-11-19 17:05:41 -08:00
Junjie Bai	7fd1ea6ab7	Cleanup caffe2 hipify exclude patterns (#14198 ) Summary: depthwise_3x3_conv_op.cu does not exist Pull Request resolved: https://github.com/pytorch/pytorch/pull/14198 Differential Revision: D13127479 Pulled By: bddppq fbshipit-source-id: ec6bd434055a49ea405c4b399bde8c074114f955	2018-11-19 14:27:56 -08:00
Edward Yang	48099c23b4	Move AT_CUDA_CHECK to c10 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13910 Reviewed By: smessmer Differential Revision: D13046201 fbshipit-source-id: 8d360a0e4d6c2edf070d130e600c6b04f0ee0058	2018-11-19 08:20:10 -08:00
Junjie Bai	0d7a986da1	Change hip filename extension to .hip (#14036 ) Summary: xw285cornell - To make hip files to have unique filename extension we change hip files from _hip.cc to .hip (it's the only blessing option other than .cu in hipcc `3d51a1fb01/bin/hipcc (L552)`). - Change to use host compiler to compile .cc\|.cpp files. Previously we use hcc to compile them which is unnecessary - Change the hipify script to not replace "gpu" with "hip" in the filename of the generated hipified files. Previously we do this because hcc has a bug when linking files that have same filename. We have now changed to use host linker to do linking so this is unnecessary anymore. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14036 Reviewed By: xw285cornell Differential Revision: D13091813 Pulled By: bddppq fbshipit-source-id: ea3d887751d8abb39d75f5d5104aa66ce66b9ee0	2018-11-16 11:55:59 -08:00
Edward Yang	fed8d8975a	Various improvements to hipify_python.py (#13973 ) Summary: - Speed up hipify_python.py by blacklisting useless (and quite large) directory trees that it would otherwise recurse into - Pass around relative paths instead of absolute paths. This makes it easier to do filename matches based on the root of the tree. - Redo the streaming output to contain more useful information - Make it handle c10/cuda correctly, rewrite c10::cuda to c10::hip, and the header name from CUDAMathCompat.h to CUDAHIPCompat.h Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/13973 Differential Revision: D13062374 Pulled By: ezyang fbshipit-source-id: f0858dd18c94d449ff5dbadc22534c695dc0f8fb	2018-11-14 17:11:24 -08:00
Johannes M Dieterich	53a3c46950	Switch to packaged Thrust on Ubuntu, enable CentOS 7.5 as a CI target (#12899 ) Summary: 1) Use the hip-thrust version of Thrust as opposed to the GH master. (ROCm 267) 2) CentOS 7.5 docker (ROCm 279) * Always install the libraries at docker creation for ubuntu. * Add Dockerfile for CentOS ROCm * Enable the centos build * Source devtoolset in bashrc * Set locales correctly depending on whether we are on Ubuntu or CentOS * Install a newer cmake for CentOS * Checkout thrust as there is no package for CentOS yet. PyTorch/Caffe2 on ROCm passed tests: https://github.com/ROCmSoftwarePlatform/pytorch/pull/280 For attention: bddppq ezyang Docker rebuild for Ubuntu not urgent (getting rid of Thrust checkout and package install is mainly cosmetic). If docker for CentOS 7.5 is wanted, build is necessary. Build of PyTorch tested by me in CentOS docker. PyTorch unit tests work mostly, however, a test in test_jit causes a python recursion error that seems to be due to the python2 on CentOS as we haven't ever seen this on Ubuntu - hence please do not enable unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12899 Differential Revision: D13029424 Pulled By: bddppq fbshipit-source-id: 1ca8f4337ec6a603f2742fc81046d5b8f8717c76	2018-11-12 14:39:54 -08:00
Ashish	5ae3b44255	Added HIP top_k operator (#13747 ) Summary: This PR contains changes for: 1. Adding HIP top_k operator in Caffe2 2. Added HIP equivalent definitions of GPUDefs and GPUScanUtils 3. Removing the top_k operator test from ROCm test ignore list 4. Bug fixes in related code in THC/THCAsmUtils.cuh Differential Revision: D12986451 Pulled By: bddppq fbshipit-source-id: 6d5241fb674eaeb7cde42166426ac88043b83504	2018-11-08 20:14:53 -08:00
rohithkrn	afc7dbd586	Hipify caffe2/utils/math_gpu.cu (#13521 ) Summary: This PR adds caffe2/utils/math_gpu.cu to pyHipify bddppq petrex Pull Request resolved: https://github.com/pytorch/pytorch/pull/13521 Differential Revision: D12954843 Pulled By: bddppq fbshipit-source-id: a2bf367da07e49cb7807ba6876b42d0733fc8205	2018-11-07 11:34:15 -08:00
Junjie Bai	a1ba29a2c0	Change to use json format to store disabled_features in hipify (#13595 ) Summary: Since json is a builtin module in Python (>= 2.6), this makes pyhipify can be invoked without installing any extra dependencies. petrex iotamudelta Pull Request resolved: https://github.com/pytorch/pytorch/pull/13595 Differential Revision: D12931045 Pulled By: bddppq fbshipit-source-id: 31d68fb6e730fd9d11593550ca531423cb0596e9	2018-11-06 22:06:10 -08:00
Junjie Bai	95ca66763d	Add math functions overloaded over different numeric types for cuda and hip (#13602 ) Summary: petrex ashishfarmer rohithkrn iotamudelta Pull Request resolved: https://github.com/pytorch/pytorch/pull/13602 Reviewed By: dzhulgakov Differential Revision: D12935797 Pulled By: bddppq fbshipit-source-id: a49ec66fb60bfd947c63dd2133d431884df62235	2018-11-06 01:40:31 -08:00
Junjie Bai	86e1009497	Make ATen core HIP compatible (#13343 ) Summary: So caffe2 can include aten core files without hipifying aten cc xw285cornell Pull Request resolved: https://github.com/pytorch/pytorch/pull/13343 Reviewed By: xw285cornell Differential Revision: D12853162 Pulled By: bddppq fbshipit-source-id: f9402691292180dde110a58ea3b1cedc62aab0ba	2018-10-31 21:08:54 -07:00
Junjie Bai	3c66520dd8	Remove aten/src/ATen/CUDAStream.cpp from hipify script (#13357 ) Summary: Deleted in https://github.com/pytorch/pytorch/pull/13251 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13357 Differential Revision: D12852983 Pulled By: bddppq fbshipit-source-id: 0816a14188590e1971fabefcd575489c7339e122	2018-10-30 19:48:07 -07:00
Xiaodong Wang	ed60f94dba	hipify caffe2 script in fbcode (#13265 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13265 Make changes to make hipify_python script to work with fbcode. 1. Add TARGETS file 2. Make hipify_python a module as well as a standalone script. Reviewed By: bddppq Differential Revision: D10851216 fbshipit-source-id: cacd04df6fe2084832256d1916d62dccea86baa9	2018-10-30 15:51:28 -07:00
Junjie Bai	883da952be	Hipify caffe2/core (#13148 ) Summary: petrex ashishfarmer iotamudelta Pull Request resolved: https://github.com/pytorch/pytorch/pull/13148 Reviewed By: xw285cornell Differential Revision: D10862276 Pulled By: bddppq fbshipit-source-id: 1754834ec50f7dd2f752780e20b2a9cf19d03fc4	2018-10-26 15:27:32 -07:00
iotamudelta	b8a11cffdb	Minor improvements cherry-pick (#12973 ) Summary: * Enable disabled functions for ROCm (ROCm 252) * fixes for topk fp16 (ROCm 270) * HIP needs kernel invocation to be explicitly templated to be able to take non-const arg as const kernel arg (ROCm 281) For attention: bddppq ezyang Full set of PyTorch/Caffe2 tests on ROCm here: https://github.com/ROCmSoftwarePlatform/pytorch/pull/283 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12973 Differential Revision: D10516072 Pulled By: bddppq fbshipit-source-id: 833b3de1544dfa4886a34e2b5ea53d77b6f0ba9e	2018-10-23 15:03:47 -07:00
iotamudelta	470e766062	Fix illegal code in rocblas_handle rocblas_handle() that causes failure w/ gcc as base compiler (#12957 ) Summary: The legal function cublasHandle_t cublas_handle() was hipified to the clearly illegal rocblas_handle rocblas_handle(). It should not work and correctly fails with gcc as the host compiler as it induces an ambiguity. Function now hipifies to rocblas_handle rocblashandle() Fixes long standing issue we've observed in PyTorch when base compiler is gcc. For attention: bddppq ezyang Tests on ROCm PyTorch/Caffe2: https://github.com/ROCmSoftwarePlatform/pytorch/pull/284 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12957 Differential Revision: D10501227 Pulled By: bddppq fbshipit-source-id: 568cb80801c0d14c9b1b61e3a7db387a5c21acf4	2018-10-23 13:46:15 -07:00
Junjie Bai	89010d60f9	Migrate HIP to use DeviceOption.device_id and delete DeviceOption.hip_gpu_id Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12546 Reviewed By: hyuen, xw285cornell Differential Revision: D10305222 fbshipit-source-id: 955e1d2878508a25fe4e9980ae66f8f54aaf7db9	2018-10-10 18:25:06 -07:00
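
Commit 999690ff3d above describes compiling all substitution mappings into a single regex by way of a Trie. The following is a minimal sketch of that general idea, not the actual pyHIPIFY code: the `Trie` class, the `hipify` helper, and the three-entry `MAPPINGS` table are illustrative assumptions, and the table is assumed to be non-empty.

```python
import re


class Trie:
    """Prefix tree whose contents can be rendered as one regex (illustrative only)."""

    def __init__(self):
        self.root = {}

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = True  # end-of-word marker

    def pattern(self):
        return self._pattern(self.root)

    def _pattern(self, node):
        # A node that holds only the end marker contributes nothing further.
        if "" in node and len(node) == 1:
            return ""
        alternatives = []
        for ch in sorted(k for k in node if k != ""):
            alternatives.append(re.escape(ch) + self._pattern(node[ch]))
        if len(alternatives) == 1:
            body = alternatives[0]
        else:
            body = "(?:" + "|".join(alternatives) + ")"
        # If a word ends here but longer words continue, the rest is optional.
        if "" in node:
            body = "(?:" + body + ")?"
        return body


# Hypothetical subset of a CUDA -> HIP mapping table.
MAPPINGS = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
}

TRIE = Trie()
for key in MAPPINGS:
    TRIE.add(key)

# One compiled regex covers every key; shared prefixes are scanned only once.
PATTERN = re.compile(r"\b" + TRIE.pattern() + r"\b")


def hipify(source):
    """Replace every mapped identifier in `source` in a single pass."""
    return PATTERN.sub(lambda m: MAPPINGS[m.group(0)], source)
```

With this table the generated pattern is `cuda(?:Free|M(?:alloc|emcpy))`, so the common prefix `cuda` is matched once instead of once per key.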


86 Commits
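
Commit e936a69085 in the log notes that `::` is a word boundary but `_` is not, which is why a namespace prefix was easier for HIPify to rewrite than an underscore prefix. A small sketch of that regex behaviour follows, assuming whole-word matching with `\b`; the helper `mentions_identifier` and the function name `raw_alloc` are hypothetical and only serve the illustration.

```python
import re


def mentions_identifier(name, text):
    """Return True if `name` occurs in `text` as a whole word (\\b on both sides)."""
    return re.search(r"\b" + re.escape(name) + r"\b", text) is not None


# ':' is not a word character, so a boundary exists before '::'.
namespace_style = mentions_identifier(
    "CUDACachingAllocator", "CUDACachingAllocator::raw_alloc(n)"
)

# '_' is a word character, so the prefix is not a whole word here.
underscore_style = mentions_identifier(
    "THCCachingAllocator", "THCCachingAllocator_raw_alloc(n)"
)
```

Here `namespace_style` is `True` and `underscore_style` is `False`, so a whole-word rule can target the namespace form but would miss the underscore form.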