pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
PyTorch MergeBot	084c4aa614	Revert "Reapply "Delete TorchScript based Android demo app and point to ExecuTorch (#153633 )" (#153656 )" This reverts commit `7ed377f577`. Reverted https://github.com/pytorch/pytorch/pull/153656 on behalf of https://github.com/larryliu0820 due to Still being used internally so can't remove ([comment](https://github.com/pytorch/pytorch/pull/153656#issuecomment-2887665403))	2025-05-16 21:00:11 +00:00
Mengwei Liu	7ed377f577	Reapply "Delete TorchScript based Android demo app and point to ExecuTorch (#153633 )" (#153656 ) This reverts commit `ae0e8f0c73`. Keep android/libs/fbjni because it's being used by other components of PyTorch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153656 Approved by: https://github.com/malfet	2025-05-16 04:35:42 +00:00
PyTorch MergeBot	ae0e8f0c73	Revert "Delete TorchScript based Android demo app and point to ExecuTorch (#153633 )" This reverts commit `b22f01fcb9`. Reverted https://github.com/pytorch/pytorch/pull/153633 on behalf of https://github.com/malfet due to But libtorch build regressions are real, fbjni is still used for C++ builds ([comment](https://github.com/pytorch/pytorch/pull/153633#issuecomment-2884951805))	2025-05-15 20:16:05 +00:00
Mengwei Liu	b22f01fcb9	Delete TorchScript based Android demo app and point to ExecuTorch (#153633 ) Delete TorchScript demo app and point people to ExecuTorch demo app. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153633 Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/atalman, https://github.com/janeyx99, https://github.com/seemethere	2025-05-15 18:43:59 +00:00
cyy	370e23388d	Set CMake 3.5 as minimum version in pytorch_android (#152769 ) I saw pytorch_android failure in docker image builds. This fix attempts to bypass CMake 4 limitations. Pull Request resolved: https://github.com/pytorch/pytorch/pull/152769 Approved by: https://github.com/Skylion007	2025-05-04 16:57:22 +00:00
Max Ren	cca34be584	Update XNNPACK Version (#139913 ) Updating XNNPACK Version to 4ea82e595b36106653175dcb04b2aa532660d0d8 submodule update Pull Request resolved: https://github.com/pytorch/pytorch/pull/139913 Approved by: https://github.com/digantdesai, https://github.com/huydhn	2024-11-18 18:16:31 +00:00
Max Ren	d92d4133e7	[8/n] Update XNNPACK Submodule Version Part 8 Everything Remaining to get it to work (#115714 ) > __Note:__ XNNPACK Upgrade is too large in the range of 40k files and 10m Lines of code, Thus we break the update of the library into multiple parts. All Parts [1 - n] Must be landed together for it to work. *This also means If there is a revert. Please revert the Entire Stack.* This change is everything remaining requiring XNNPACK version to work. @allow-large-files Differential Revision: [D52099769](https://our.internmc.facebook.com/intern/diff/D52099769/) --- submodule (unblock merge to make ShipIt happy) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115714 Approved by: https://github.com/digantdesai	2023-12-15 23:08:08 +00:00
Nikita Shulga	36ac095ff8	Migrate PyTorch to C++17 (#85969 ) With CUDA-10.2 gone we can finally do it! This PR mostly contains build system related changes, invasive functional ones are to be followed. Among many expected tweaks to the build system, here are few unexpected ones: - Force onnx_proto project to be updated to C++17 to avoid `duplicate symbols` error when compiled by gcc-7.5.0, as storage rule for `constexpr` changed in C++17, but gcc does not seem to follow it - Do not use `std::apply` on CUDA but rely on the built-in variant, as it results in test failures when CUDA runtime picks host rather than device function when `std::apply` is invoked from CUDA code. - `std::decay_t` -> `::std::decay_t` and `std::move`->`::std::move` as VC++ for some reason claims that `std` symbol is ambigious - Disable use of `std::aligned_alloc` on Android, as its `libc++` does not implement it. Some prerequisites: - https://github.com/pytorch/pytorch/pull/89297 - https://github.com/pytorch/pytorch/pull/89605 - https://github.com/pytorch/pytorch/pull/90228 - https://github.com/pytorch/pytorch/pull/90389 - https://github.com/pytorch/pytorch/pull/90379 - https://github.com/pytorch/pytorch/pull/89570 - https://github.com/facebookincubator/gloo/pull/336 - https://github.com/facebookincubator/gloo/pull/343 - `919676fb32` Fixes https://github.com/pytorch/pytorch/issues/56055 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85969 Approved by: https://github.com/ezyang, https://github.com/kulinseth	2022-12-08 02:27:48 +00:00
Tongliang Liao	dff70a5e1a	Make language std configurable. (#75519 ) RocksDB 7 starts to use C++17 in header. We should make this configurable, in case user needs higher std version. List of files to changed is found by `git grep 'CMAKE_[^_]*_STANDARD'`. Doc string is from CMake code. Pull Request resolved: https://github.com/pytorch/pytorch/pull/75519 Approved by: https://github.com/malfet	2022-07-13 14:21:27 +00:00
Chen Lai	0c3db1cb33	[Pytorch] Build lite interpreter as default for Android Summary: Build lite interpreter as default for android, should wait until https://github.com/pytorch/pytorch/pull/56002 lands Mainly two changes: 1. Use lite interpreter as default for Android 2. Switch the lite interpreter build test to full jit build test Test Plan: Imported from OSS Differential Revision: D27695530 Reviewed By: IvanKobzarev Pulled By: cccclai fbshipit-source-id: e1b2c70fee6590accc22c7404b9dd52c7d7c36e2	2021-05-17 14:12:48 -07:00
Ailing Zhang	eb52e36460	Revert D27469727: [pytorch][PR] [android] fbjni from prefab dependency 0.2.2 Test Plan: revert-hammer Differential Revision: D27469727 (`507b46f23e`) Original commit changeset: 2ab22879e81c fbshipit-source-id: d656463b81a02fbf870dded5d3868bb33e016fe0	2021-03-31 17:21:30 -07:00
Ivan Kobzarev	507b46f23e	[android] fbjni from prefab dependency 0.2.2 (#55066 ) Summary: Switching pytorch android to use fbjni from prefab dependencies Bumping version of fbjni to 0.2.2 soloader version to 0.10.1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/55066 Reviewed By: dreiss Differential Revision: D27469727 Pulled By: IvanKobzarev fbshipit-source-id: 2ab22879e81c9f2acf56807c6a133b0ca20bb40a	2021-03-31 14:12:18 -07:00
Alexander Golynski	09756e7280	Revert D27370295: [android] fbjni android use prefab dependency, version 0.2.2 Test Plan: revert-hammer Differential Revision: D27370295 (`2bee09a577`) Original commit changeset: bde881a8d4ed fbshipit-source-id: 2fcc8f522fb08d4f8299f7e824341be32afb184a	2021-03-31 06:13:26 -07:00
Ivan Kobzarev	2bee09a577	[android] fbjni android use prefab dependency, version 0.2.2 (#54792 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54792 Test Plan: Imported from OSS Reviewed By: dreiss Differential Revision: D27370295 Pulled By: IvanKobzarev fbshipit-source-id: bde881a8d4edd4636aa4ec7cecbe770b5b65bb1f	2021-03-30 20:26:36 -07:00
Chen Lai	14f7bf0629	[PyTorch] update CMake to build libtorch lite (#51419 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51419 ## Summary 1. Add an option `BUILD_LITE_INTERPRETER` in `caffe2/CMakeLists.txt` and set `OFF` as default. 2. Update 'build_android.sh' with an argument to swtich `BUILD_LITE_INTERPRETER`, 'OFF' as default. 3. Add a mini demo app `lite_interpreter_demo` linked with `libtorch` library, which can be used for quick test. ## Test Plan Built lite interpreter version of libtorch and test with Image Segmentation demo app ([android version](https://github.com/pytorch/android-demo-app/tree/master/ImageSegmentation)/[ios version](https://github.com/pytorch/ios-demo-app/tree/master/ImageSegmentation)) ### Android 1. Prepare model: Prepare the lite interpreter version of model by run the script below to generate the scripted model `deeplabv3_scripted.pt` and `deeplabv3_scripted.ptl` ``` import torch model = torch.hub.load('pytorch/vision:v0.7.0', 'deeplabv3_resnet50', pretrained=True) model.eval() scripted_module = torch.jit.script(model) # Export full jit version model (not compatible lite interpreter), leave it here for comparison scripted_module.save("deeplabv3_scripted.pt") # Export lite interpreter version model (compatible with lite interpreter) scripted_module._save_for_lite_interpreter("deeplabv3_scripted.ptl") ``` 2. Build libtorch lite for android: Build libtorch for android for all 4 android abis (armeabi-v7a, arm64-v8a, x86, x86_64) `BUILD_LITE_INTERPRETER=1 ./scripts/build_pytorch_android.sh`. This pr is tested on Pixel 4 emulator with x86, so use cmd `BUILD_LITE_INTERPRETER=1 ./scripts/build_pytorch_android.sh x86` to specify abi to save built time. After the build finish, it will show the library path: ``` ... BUILD SUCCESSFUL in 55s 134 actionable tasks: 22 executed, 112 up-to-date + find /Users/chenlai/pytorch/android -type f -name 'aar' + xargs ls -lah -rw-r--r-- 1 chenlai staff 13M Feb 11 11:48 /Users/chenlai/pytorch/android/pytorch_android/build/outputs/aar/pytorch_android-release.aar -rw-r--r-- 1 chenlai staff 36K Feb 9 16:45 /Users/chenlai/pytorch/android/pytorch_android_torchvision/build/outputs/aar/pytorch_android_torchvision-release.aar ``` 3. Use the PyTorch Android libraries built from source in the ImageSegmentation app: Create a folder 'libs' in the path, the path from repository root will be `ImageSegmentation/app/libs`. Copy `pytorch_android-release` to the path `ImageSegmentation/app/libs/pytorch_android-release.aar`. Copy 'pytorch_android_torchvision` (downloaded from [here](https://oss.sonatype.org/#nexus-search;quick~torchvision_android)) to the path `ImageSegmentation/app/libs/pytorch_android_torchvision.aar` Update the `dependencies` part of `ImageSegmentation/app/build.gradle` to ``` dependencies { implementation 'androidx.appcompat:appcompat:1.2.0' implementation 'androidx.constraintlayout:constraintlayout:2.0.2' testImplementation 'junit:junit:4.12' androidTestImplementation 'androidx.test.ext:junit:1.1.2' androidTestImplementation 'androidx.test.espresso:espresso-core:3.3.0' implementation(name:'pytorch_android-release', ext:'aar') implementation(name:'pytorch_android_torchvision', ext:'aar') implementation 'com.android.support:appcompat-v7:28.0.0' implementation 'com.facebook.fbjni:fbjni-java-only:0.0.3' } ``` Update `allprojects` part in `ImageSegmentation/build.gradle` to ``` allprojects { repositories { google() jcenter() flatDir { dirs 'libs' } } } ``` 4. Update model loader api: Update `ImageSegmentation/app/src/main/java/org/pytorch/imagesegmentation/MainActivity.java` by 4.1 Add new import: `import org.pytorch.LiteModuleLoader;` 4.2 Replace the way to load pytorch lite model ``` // mModule = Module.load(MainActivity.assetFilePath(getApplicationContext(), "deeplabv3_scripted.pt")); mModule = LiteModuleLoader.load(MainActivity.assetFilePath(getApplicationContext(), "deeplabv3_scripted.ptl")); ``` 5. Test app: Build and run the ImageSegmentation app in Android Studio, ![image](https://user-images.githubusercontent.com/16430979/107696279-9cea5900-6c66-11eb-8286-4d1d68abff61.png) ### iOS 1. Prepare model: Same as Android. 2. Build libtorch lite for ios* `BUILD_PYTORCH_MOBILE=1 IOS_PLATFORM=SIMULATOR BUILD_LITE_INTERPRETER=1 ./scripts/build_ios.sh` 3. Remove Cocoapods from the project: run `pod deintegrate` 4. Link ImageSegmentation demo app with the custom built library: Open your project in XCode, go to your project Target’s Build Phases - Link Binaries With Libraries, click the + sign and add all the library files located in `build_ios/install/lib`. Navigate to the project Build Settings, set the value Header Search Paths to `build_ios/install/include` and Library Search Paths to `build_ios/install/lib`. In the build settings, search for other linker flags. Add a custom linker flag below ``` -all_load ``` Finally, disable bitcode for your target by selecting the Build Settings, searching for Enable Bitcode, and set the value to No. 5. Update library and api 5.1 Update `TorchModule.mm`` To use the custom built libraries the project, replace `#import <LibTorch/LibTorch.h>` (in `TorchModule.mm`) which is needed when using LibTorch via Cocoapods with the code below: ``` //#import <LibTorch/LibTorch.h> #include "ATen/ATen.h" #include "caffe2/core/timer.h" #include "caffe2/utils/string_utils.h" #include "torch/csrc/autograd/grad_mode.h" #include "torch/script.h" #include <torch/csrc/jit/mobile/function.h> #include <torch/csrc/jit/mobile/import.h> #include <torch/csrc/jit/mobile/interpreter.h> #include <torch/csrc/jit/mobile/module.h> #include <torch/csrc/jit/mobile/observer.h> ``` 5.2 Update `ViewController.swift` ``` // if let filePath = Bundle.main.path(forResource: // "deeplabv3_scripted", ofType: "pt"), // let module = TorchModule(fileAtPath: filePath) { // return module // } else { // fatalError("Can't find the model file!") // } if let filePath = Bundle.main.path(forResource: "deeplabv3_scripted", ofType: "ptl"), let module = TorchModule(fileAtPath: filePath) { return module } else { fatalError("Can't find the model file!") } ``` ### Unit test Add `test/cpp/lite_interpreter`, with one unit test `test_cores.cpp` and a light model `sequence.ptl` to test `_load_for_mobile()`, `bc.find_method()` and `bc.forward()` functions. ### Size: With the change: Android: x86: `pytorch_android-release.aar` (13.8 MB) IOS: `pytorch/build_ios/install/lib` (lib: 66 MB): ``` (base) chenlai@chenlai-mp lib % ls -lh total 135016 -rw-r--r-- 1 chenlai staff 3.3M Feb 15 20:45 libXNNPACK.a -rw-r--r-- 1 chenlai staff 965K Feb 15 20:45 libc10.a -rw-r--r-- 1 chenlai staff 4.6K Feb 15 20:45 libclog.a -rw-r--r-- 1 chenlai staff 42K Feb 15 20:45 libcpuinfo.a -rw-r--r-- 1 chenlai staff 39K Feb 15 20:45 libcpuinfo_internals.a -rw-r--r-- 1 chenlai staff 1.5M Feb 15 20:45 libeigen_blas.a -rw-r--r-- 1 chenlai staff 148K Feb 15 20:45 libfmt.a -rw-r--r-- 1 chenlai staff 44K Feb 15 20:45 libpthreadpool.a -rw-r--r-- 1 chenlai staff 166K Feb 15 20:45 libpytorch_qnnpack.a -rw-r--r-- 1 chenlai staff 384B Feb 15 21:19 libtorch.a -rw-r--r-- 1 chenlai staff 60M Feb 15 20:47 libtorch_cpu.a ``` `pytorch/build_ios/install`: ``` (base) chenlai@chenlai-mp install % du -sh * 14M include 66M lib 2.8M share ``` Master (baseline): Android: x86: `pytorch_android-release.aar` (16.2 MB) IOS: `pytorch/build_ios/install/lib` (lib: 84 MB): ``` (base) chenlai@chenlai-mp lib % ls -lh total 172032 -rw-r--r-- 1 chenlai staff 3.3M Feb 17 22:18 libXNNPACK.a -rw-r--r-- 1 chenlai staff 969K Feb 17 22:18 libc10.a -rw-r--r-- 1 chenlai staff 4.6K Feb 17 22:18 libclog.a -rw-r--r-- 1 chenlai staff 42K Feb 17 22:18 libcpuinfo.a -rw-r--r-- 1 chenlai staff 1.5M Feb 17 22:18 libeigen_blas.a -rw-r--r-- 1 chenlai staff 44K Feb 17 22:18 libpthreadpool.a -rw-r--r-- 1 chenlai staff 166K Feb 17 22:18 libpytorch_qnnpack.a -rw-r--r-- 1 chenlai staff 384B Feb 17 22:19 libtorch.a -rw-r--r-- 1 chenlai staff 78M Feb 17 22:19 libtorch_cpu.a ``` `pytorch/build_ios/install`: ``` (base) chenlai@chenlai-mp install % du -sh * 14M include 84M lib 2.8M share ``` Test Plan: Imported from OSS Reviewed By: iseeyuan Differential Revision: D26518778 Pulled By: cccclai fbshipit-source-id: 4503ffa1f150ecc309ed39fb0549e8bd046a3f9c	2021-02-21 01:43:54 -08:00
Ivan Kobzarev	dbfaf966b0	[android] turn on USE_VULKAN for android builds by default (#51291 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51291 Turning on USE_VULKAN for android builds Remove standalone android vulkan build Testing all ci jobs (for master): https://github.com/pytorch/pytorch/pull/51292 Test Plan: Imported from OSS Reviewed By: AshkanAliabadi Differential Revision: D26141891 Pulled By: IvanKobzarev fbshipit-source-id: e8e1a4ab612c0786ce09217ab9370fd75a71eb00	2021-01-29 11:58:21 -08:00
Elijah Rippeth	aaf6582d02	fix issue by which pytorch_jni is not bundled in libtorch (#46466 ) Summary: Fixes issue with pytorch_jni.dll not being installed correctly in libtorch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46466 Reviewed By: zhangguanheng66 Differential Revision: D25247564 Pulled By: ezyang fbshipit-source-id: a509476ec4a0863fd67da3258e9300a9527d4f3b	2020-12-01 11:38:56 -08:00
Elijah Rippeth	b5479737d7	Add windows JNI support (#44257 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/32516 Pull Request resolved: https://github.com/pytorch/pytorch/pull/44257 Reviewed By: malfet Differential Revision: D24332820 Pulled By: ezyang fbshipit-source-id: 1dd97e9c8140129a02a9078623b190b33f30d5b0	2020-10-15 10:48:45 -07:00
David Reiss	b7e044f0e5	Re-apply PyTorch pthreadpool changes Summary: This re-applies D21232894 (`b9d3869df3`) and D22162524, plus updates jni_deps in a few places to avoid breaking host JNI tests. Test Plan: `buck test @//fbandroid/mode/server //fbandroid/instrumentation_tests/com/facebook/caffe2:host-test` Reviewed By: xcheng16 Differential Revision: D22199952 fbshipit-source-id: df13eef39c01738637ae8cf7f581d6ccc88d37d5	2020-06-23 19:26:21 -07:00
Kate Mormysh	92d3182c11	Revert D21232894: Unify PyTorch mobile's threadpool usage. Test Plan: revert-hammer Differential Revision: D21232894 (`b9d3869df3`) Original commit changeset: 8b3de86247fb fbshipit-source-id: e6517cfec08f7dd0f4f8877dab62acf1d65afacd	2020-06-23 17:09:14 -07:00
Ashkan Aliabadi	b9d3869df3	Unify PyTorch mobile's threadpool usage. (#37243 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37243 * Why * As it stands, we have two thread pool solutions concurrently in use in PyTorch mobile: (1) the open source pthreadpool library under third_party, and (2) Caffe2's implementation of pthreadpool under caffe2/utils/threadpool. Since the primary use-case of the latter has been to act as a drop-in replacement for the third party version so as to enable integration and usage from within NNPACK and QNNPACK, Caffe2's implementation is intentionally written to the exact same interface as the third party version. The original argument in favor of C2's implementation has been improved performance as a result of using spin locks, as opposed to relinquishing the thread's time slot and putting it to sleep - a less expensive operation up to a point. That seems to have given C2's implementation the upper hand in performance, hence justifying the added maintenance complexity, until the third party version improved in parallel surpassing the efficiency of C2's implementation as I have verified in benchmarks. With that advantage gone, there is no reason to continue using C2's implementation in PyTorch mobile either from the perspective of performance or code hygiene. As a matter of fact, there is considerable performance benefit to be had as a result of using the third party version as it currently stands. This is a tricky change though, mainly because in order to avoid potential performance regressions, of which I have witnessed none but just in abundance of caution, we have decided to continue using the internal C2's implementation whenever building for Caffe2. Again, this is mainly to avoid potential performance regressions in production C2 use cases even if doing so results in reduced performance as far as I can tell. So to summarize, today, and as it currently stands, we are using C2's implementation for (1) NNPACK, (2) PyTorch QNNPACK, and (3) ATen parallel_for on mobile builds, while using the third party version of pthreadpool for XNNPACK as XNNPACK does not provide any build options to link against an external implementation unlike NNPACK and QNNPACK do. The goal of this PR then, is to unify all usage on mobile to the third party implementation both for improved performance and better code hygiene. This applies to PyTorch's use of NNPACK, QNNPACK, XNNPACK, and mobile's implementation of ATen parallel_for, all getting routed to the exact same third party implementation in this PR. Considering that NNPACK, QNNPACK, and XNNPACK are not mobile specific, these benefits carry over to non-mobile builds of PyTorch (but not Caffe2) as well. The implementation of ATen parallel_for on non-mobile builds remains unchanged. * How * This is where things get tricky. A good deal of the build system complexity in this PR arises from our desire to maintain C2's implementation intact for C2's use. pthreadpool is a C library with no concept of namespaces, which means two copies of the library cannot exist in the same binary or symbol collision will occur violating ODR. This means that somehow, and based on some condition, we must decide on the choice of a pthreadpool implementation. In practice, this has become more complicated as a result of all the possible combinations that USE_NNPACK, USE_QNNPACK, USE_PYTORCH_QNNPACK, USE_XNNPACK, USE_SYSTEM_XNNPACK, USE_SYSTEM_PTHREADPOOL and other variables can result in. Having said that, I have done my best in this PR to surgically cut through this complexity in a way that minimizes the side effects, considering the significance of the performance we are leaving on the table, yet, as a result of this combinatorial explosion explained above I cannot guarantee that every single combination will work as expected on the first try. I am heavily relying on CI to find any issues as local testing can only go that far. Having said that, this PR provides a simple non mobile-specific C++ thread pool implementation on top of pthreadpool, namely caffe2::PThreadPool that automatically routes to C2's implementation or the third party version depending on the build configuration. This simplifies the logic at the cost of pushing the complexity to the build scripts. From there on, this thread pool is used in aten parallel_for, and NNPACK and family, again, routing all usage of threading to C2 or third party pthreadpool depending on the build configuration. When it is all said or done, the layering will look like this: a) aten::parallel_for, uses b) caffe2::PThreadPool, which uses c) pthreadpool C API, which delegates to c-1) third_party implementation of pthreadpool if that's what the build has requested, and the rabbit hole ends here. c-2) C2's implementation of pthreadpool if that's what the build has requested, which itself delegates to c-2-1) caffe2::ThreadPool, and the rabbit hole ends here. NNPACK, and (PyTorch) QNNPACK directly hook into (c). They never go through (b). Differential Revision: D21232894 Test Plan: Imported from OSS Reviewed By: dreiss Pulled By: AshkanAliabadi fbshipit-source-id: 8b3de86247fbc3a327e811983e082f9d40081354	2020-06-23 16:34:51 -07:00
Ivan Kobzarev	0891764e80	[android] ANDROID_STL=c++_shared (#39588 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39588 Before this diff we used c++_static linking. Users will dynamically link to libpytorch_jni.so and have at least one more their own shared library that probably uses stl library. We must have not more than one stl per app. ( https://developer.android.com/ndk/guides/cpp-support#one_stl_per_app ) To have only one stl per app changing ANDROID_STL way to c++_shared, that will add libc++_shared.so to packaging. Test Plan: Imported from OSS Differential Revision: D22118031 Pulled By: IvanKobzarev fbshipit-source-id: ea1e5085ae207a2f42d1fa9f6ab8ed0a21768e96	2020-06-18 13:50:05 -07:00
Ivan Kobzarev	928e99b9bb	[vulkan] jni build support USE_VULKAN (#39188 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39188 Extracting Vulkan_LIBS and Vulkan_INCLUDES setup from `cmake/Dependencies.cmake` to `cmake/VulkanDependencies.cmake` and reuse it in android/pytorch_android/CMakeLists.txt Adding control to build with Vulkan setting env variable `USE_VULKAN` for `scripts/build_android.sh` `scripts/build_pytorch_android.sh` We do not use Vulkan backend in pytorch_android, but with this build option we can track android aar change with `USE_VULKAN` added. Currently it is 88Kb. Test Plan: Imported from OSS Differential Revision: D21770892 Pulled By: IvanKobzarev fbshipit-source-id: a39433505fdcf43d3b524e0fe08062d5ebe0d872	2020-05-28 15:39:02 -07:00
Nikita Shulga	b9adbb5002	Fix/relax CMake linter rules (#35574 ) Summary: Ignore mixed upper-case/lower-case style for now Fix space between function and its arguments violation Pull Request resolved: https://github.com/pytorch/pytorch/pull/35574 Test Plan: CI Differential Revision: D20712969 Pulled By: malfet fbshipit-source-id: 0012d430aed916b4518599a0b535e82d15721f78	2020-03-27 16:52:33 -07:00
Ashkan Aliabadi	6aecfd1e80	Mobile Backend: NHWC memory layout + XNNPACK integration. (#33722 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33722 In order to improve CPU performance on floating-point models on mobile, this PR introduces a new CPU backend for mobile that implements the most common mobile operators with NHWC memory layout support through integration with XNNPACK. XNNPACK itself, and this codepath, are currently only included in the build, but the actual integration is gated with USE_XNNPACK preprocessor guards. This preprocessor symbol is intentionally not passed on to the compiler, so as to enable this rollout in multiple stages in follow up PRs. This changeset will build XNNPACK as part of the build if the identically named USE_XNNPACK CMAKE variable, defaulted to ON, is enabled, but will not actually expose or enable this code path in any other way. Furthermore, it is worth pointing out that in order to efficiently map models to these operators, some front-end method of exposing this backend to the user is needed. The less efficient implementation would be to hook these operators into their corresponding native implementations, granted that a series of XNNPACK-specific conditions are met, much like how NNPACK is integrated with PyTorch today for instance. Having said that, while the above implementation is still expected to outperform NNPACK based on the benchmarks I ran, the above integration would be leave a considerable gap between the performance achieved and the maximum performance potential XNNPACK enables, as it does not provide a way to compute and factor out one-time operations out of the inner most forward() loop. The more optimal solution, and one we will decide on soon, would involve either providing a JIT pass that maps nn operators onto these newly introduced operators, while allowing one-time calculations to be factored out, much like quantized mobile models. Alternatively, new eager-mode modules can also be introduced that would directly call into these implementations either through c10 or some other mechanism, also allowing for decoupling of op creation from op execution. This PR does not include any of the front end changes mentioned above. Neither does it include the mobile threadpool unification present in the original https://github.com/pytorch/pytorch/issues/30644. Furthermore, this codepath seems to be faster than NNPACK in a good number of use cases, which can potentially allow us to remove NNPACK from aten to make the codebase a little simpler, granted that there is widespread support for such a move. Regardless, these changes will be introduced gradually and in a more controlled way in subsequent PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32509 Test Plan: Build: CI Functionality: Not exposed Reviewed By: dreiss Differential Revision: D20069796 Pulled By: AshkanAliabadi fbshipit-source-id: d46c1c91d4bea91979ea5bd46971ced5417d309c	2020-02-24 21:58:56 -08:00
Ashkan Aliabadi	039dc90854	Revert D19521853: [pytorch][PR] Mobile Backend: NHWC memory layout + XNNPACK integration. Test Plan: revert-hammer Differential Revision: D19521853 Original commit changeset: 99a1fab31d0e fbshipit-source-id: 76dfc1f481797ba2386997533cf19957637687d6	2020-02-23 22:07:19 -08:00
Ashkan Aliabadi	941b42428a	Mobile Backend: NHWC memory layout + XNNPACK integration. (#32509 ) Summary: In order to improve CPU performance on floating-point models on mobile, this PR introduces a new CPU backend for mobile that implements the most common mobile operators with NHWC memory layout support through integration with XNNPACK. XNNPACK itself, and this codepath, are currently only included in the build, but the actual integration is gated with USE_XNNPACK preprocessor guards. This preprocessor symbol is intentionally not passed on to the compiler, so as to enable this rollout in multiple stages in follow up PRs. This changeset will build XNNPACK as part of the build if the identically named USE_XNNPACK CMAKE variable, defaulted to ON, is enabled, but will not actually expose or enable this code path in any other way. Furthermore, it is worth pointing out that in order to efficiently map models to these operators, some front-end method of exposing this backend to the user is needed. The less efficient implementation would be to hook these operators into their corresponding native implementations, granted that a series of XNNPACK-specific conditions are met, much like how NNPACK is integrated with PyTorch today for instance. Having said that, while the above implementation is still expected to outperform NNPACK based on the benchmarks I ran, the above integration would be leave a considerable gap between the performance achieved and the maximum performance potential XNNPACK enables, as it does not provide a way to compute and factor out one-time operations out of the inner most forward() loop. The more optimal solution, and one we will decide on soon, would involve either providing a JIT pass that maps nn operators onto these newly introduced operators, while allowing one-time calculations to be factored out, much like quantized mobile models. Alternatively, new eager-mode modules can also be introduced that would directly call into these implementations either through c10 or some other mechanism, also allowing for decoupling of op creation from op execution. This PR does not include any of the front end changes mentioned above. Neither does it include the mobile threadpool unification present in the original https://github.com/pytorch/pytorch/issues/30644. Furthermore, this codepath seems to be faster than NNPACK in a good number of use cases, which can potentially allow us to remove NNPACK from aten to make the codebase a little simpler, granted that there is widespread support for such a move. Regardless, these changes will be introduced gradually and in a more controlled way in subsequent PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32509 Reviewed By: dreiss Differential Revision: D19521853 Pulled By: AshkanAliabadi fbshipit-source-id: 99a1fab31d0ece64961df074003bb852c36acaaa	2020-02-23 19:08:42 -08:00
David Reiss	e4f43bf7a5	Set rpath for JNI library on Mac (#32247 ) Summary: Without this, dlopen won't look in the proper directory for dependencies (like libtorch and fbjni). Pull Request resolved: https://github.com/pytorch/pytorch/pull/32247 Test Plan: Build libpytorch_jni.dylib on Mac, replaced the one from the libtorch nightly, and was able to run the Java demo. Differential Revision: D19501498 Pulled By: dreiss fbshipit-source-id: 13ffdff9622aa610f905d039f951ee9a3fdc6b23	2020-01-21 11:30:39 -08:00
Edward Yang	38986e1dea	Split libtorch.so back into libtorch_{cpu,cuda,hip} (#30315 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30315 The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of https://github.com/pytorch/pytorch/pull/29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an exported fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failer when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see https://github.com/ROCm-Developer-Tools/HIP/issues/1706 Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D18790941 Pulled By: ezyang fbshipit-source-id: 01296f6089d3de5e8365251b490c51e694f2d6c7	2019-12-04 08:04:57 -08:00
Sebastian Messmer	bc2e6d10fa	Back out "Revert D17908478: Switch PyTorch/Caffe2 to C++14" Summary: Original commit changeset: 775d2e29be0b Test Plan: CI Reviewed By: mruberry Differential Revision: D18775520 fbshipit-source-id: a350b3f86b66d97241f208786ee67e9a51172eac	2019-12-03 14:33:43 -08:00
Sebastian Messmer	a2ed50c920	Revert D17908478: Switch PyTorch/Caffe2 to C++14 Test Plan: revert-hammer Differential Revision: D17908478 Original commit changeset: 6e340024591e fbshipit-source-id: 775d2e29be0bc3a0db64f164c8960c44d4877d5d	2019-11-27 14:57:05 -08:00
Sebastian Messmer	d0acc9c085	Switch PyTorch/Caffe2 to C++14 (#30406 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30406 ghstack-source-id: 94642238 Test Plan: waitforsandcastle Differential Revision: D17908478 fbshipit-source-id: 6e340024591ec2c69521668022999df4a33b4ddb	2019-11-27 10:47:31 -08:00
David Reiss	e5fc86130a	Remove unnecessary linker flags from JNI host build (#30206 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30206 - --whole-archive isn't needed because we link libtorch as a dynamic dependency, rather than static. - --gc-sections isn't necessary because most (all?) of the code in our JNI library is used (and we're not staticly linking libtorch). Removing this one is useful because it's not supported by lld. Test Plan: Built on Linux. Library size was unchanged. Upcoming diff enables Mac JNI build. Differential Revision: D18653500 Pulled By: dreiss fbshipit-source-id: 49ce46fb86a775186f803ada50445b4b2acb54a8	2019-11-21 20:10:06 -08:00
Junjie Bai	352731bd6e	Revert D18632773: Split libtorch.so back into libtorch_{cpu,cuda,hip} Test Plan: revert-hammer Differential Revision: D18632773 Original commit changeset: ea717c81e0d7 fbshipit-source-id: 18601439f9f81c9f389020e5a0e4e04adb21772d	2019-11-21 15:01:09 -08:00
Edward Yang	ec30d9028a	Split libtorch.so back into libtorch_{cpu,cuda,hip} (#29731 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29731 The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. Some subtleties about the patch: - There were a few functions that crossed CPU-CUDA boundary without API macros. I just added them, easy enough. An inverse situation was aten/src/THC/THCTensorRandom.cu where we weren't supposed to put API macros directly in a cpp file. - DispatchStub wasn't getting all of its symbols related to static members on DispatchStub exported properly. I tried a few fixes but in the end I just moved everyone off using DispatchStub to dispatch CUDA/HIP (so they just use normal dispatch for those cases.) Additionally, there were some mistakes where people incorrectly were failing to actually import the declaration of the dispatch stub, so added includes for those cases. - torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) - The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 - In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/l ibprotobuf.a(arena.cc.o) is referenced by DSO" - A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly. This situation also happens with custom C++ extensions. - There's a ROCm compiler bug where extern "C" on functions is not respected. There's a little workaround to handle this. - Because I was too lazy to check if HIPify was converting TORCH_CUDA_API into TORCH_HIP_API, I just made it so HIP build also triggers the TORCH_CUDA_API macro. Eventually, we should translate and keep the nature of TORCH_CUDA_API constant in all cases. Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D18632773 Pulled By: ezyang fbshipit-source-id: ea717c81e0d7554ede1dc404108603455a81da82	2019-11-21 11:27:33 -08:00
David Reiss	d22f61432d	Update fbjni and enable PyTorch JNI build Summary: - Add a "BUILD_JNI" option that enables building PyTorch JNI bindings and fbjni. This is off by default because it adds a dependency on jni.h. - Update to the latest fbjni so we can inhibit building its tests, because they depend on gtest. - Set JAVA_HOME and BUILD_JNI in Linux binary build configurations if we can find jni.h in Docker. Test Plan: - Built on dev server. - Verified that libpytorch_jni links after libtorch when both are built in a parallel build. Differential Revision: D18536828 fbshipit-source-id: 19cb3be8298d3619352d02bb9446ab802c27ec66	2019-11-15 13:59:44 -08:00
Ivan Kobzarev	aa6e992ffb	Subscribe for record function and if android do atrace (#28708 ) Summary: ghstack-source-id: `5edaf47155` Pull Request resolved: https://github.com/pytorch/pytorch/pull/28708 Some cpp formatting changes as I run `clang-format -i` Testing on devserver: make assets (models): ``` pushd android/test_app/; python make_assets.py; popd ``` Build test_app apk: ``` TRACE_ENABLED=1 sh android/build_test_app.sh find . -type f -name *apk ./android/test_app/app/build/outputs/apk/mobNet2Quant/debug/test_app-mobNet2Quant-debug.apk ./android/test_app/app/build/outputs/apk/resnet18/debug/test_app-resnet18-debug.apk ``` Install apk: `adb install -r test_app-mobNet2Quant-debug.apk` Run app on the device. Systrace: ``` $ANDROID_HOME/platform-tools/systrace/systrace.py -t 10 -a org.pytorch.testapp.mobNet2Quant sched freq idle am wm gfx view binder_driver hal dalvik camera input res -o trace.html ``` trace.html contains sections like `jni::Module::forward` ![Screenshot 2019-11-12 18 36 30](https://user-images.githubusercontent.com/6638825/68728156-5d245580-057b-11ea-9e71-e47681894fe4.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/28712 Differential Revision: D18495898 Pulled By: IvanKobzarev fbshipit-source-id: 0bced4a442f9dd90525520972a2c1f5d51f57df3	2019-11-13 20:55:40 -08:00
Xingying Cheng	5654eccfe2	Add pytorch_jni_lite for lite interpreter. (#29621 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29621 Add pytorch_jni_lite for lite interpreter. ghstack-source-id: 93867325 Test Plan: buck build xplat/caffe2/android:pytorch-jni buck build xplat/caffe2/android:pytorch buck install -r fb4a Reviewed By: dreiss Differential Revision: D18438343 fbshipit-source-id: 7d4dee11d352cc9a67339c45d9d7f4a2ba285ebc	2019-11-13 16:16:29 -08:00
David Reiss	c1140f20dc	Rename PyTorch JNI library to pytorch_jni (#29412 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29412 Originally, this was going to be Android-only, so the name wasn't too important. But now that we're planning to distribute it with libtorch, we should give it a more distinctive name. Test Plan: Ran tests according to https://github.com/pytorch/pytorch/issues/6570#issuecomment-548537834 Reviewed By: IvanKobzarev Differential Revision: D18405207 fbshipit-source-id: 0e6651cb34fb576438f24b8a9369e10adf9fecf9	2019-11-08 14:29:13 -08:00
David Reiss	80e270a76c	Add support for host build to pytorch_android native code (#27664 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27664 When ANDROID_ABI is not set, find libtorch headers and libraries from the LIBTORCH_HOME build variable (which must be set by hand), place output under a "host" directory, and use dynamic linking instead of static. This doesn't actually work without some local changes to fbjni, but I want to get the changes landed to avoid unnecessary merge conflicts. Test Plan: Imported from OSS Differential Revision: D18210315 Pulled By: dreiss fbshipit-source-id: 685a62de3c2a0a52bec7fd6fb95113058456bac8	2019-10-29 16:04:18 -07:00
David Reiss	34455c68b5	Remove unnecessary BUILD_DIR variable in Android CMake build (#27663 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27663 CMake sets CMAKE_BINARY_DIR and creates it automatically. Using this allows us to use the -B command-line flag to CMake to specify an alternate output directory. Test Plan: Imported from OSS Differential Revision: D18210316 Pulled By: dreiss fbshipit-source-id: ba2f6bd4b881ddd00de73fe9c33d82645ad5495d	2019-10-29 16:04:13 -07:00
Jiakai Liu	8f54d0d6b6	update android/iOS build library packing (#26565 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26565 For OSS mobile build we should keep QNNPACK off and PYTORCH_QNNPACK on as we don't include caffe2 ops that use third_party/QNNPACK. Update android/iOS build script to include new libraries accordingly. Test Plan: - CI build Differential Revision: D17508918 Pulled By: ljk53 fbshipit-source-id: 0483d45646d4d503b4e5c1d483e4df72cffc6c68	2019-09-20 17:48:15 -07:00
Jiakai Liu	d6e3aed032	add eigen blas for mobile build (#26508 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26508 Enable BLAS for pytorch mobile build using Eigen BLAS. It's not most juicy optimization for typical mobile CV models as we are already using NNPACK/QNNPACK for most ops there. But it's nice to have good fallback implementation for other ops. Test Plan: - Create a simple matrix multiplication script model: ``` import torch class Net(torch.nn.Module): def __init__(self): super(Net, self).__init__() self.weights = torch.ones(1000, 1000) def forward(self, x): return torch.mm(x, self.weights) n = Net() module = torch.jit.trace_module(n, {'forward': torch.ones(1000, 1000)}) module.save('mm.pk') ``` - Before integrate with eigen blas: ``` adb shell 'cd /data/local/tmp; \ ./speed_benchmark_torch \ --model=mm.pk \ --input_dims="1000,1000" \ --input_type=float \ --warmup=5 \ --iter=5' Milliseconds per iter: 2218.52. ``` - After integrate with eigen blas: ``` adb shell 'cd /data/local/tmp; \ ./speed_benchmark_torch_eigen \ --model=mm.pk \ --input_dims="1000,1000" \ --input_type=float \ --warmup=5 \ --iter=5' Milliseconds per iter: 314.535. ``` - Improve MobileNetV2 single thread perf by ~5%: ``` adb shell 'cd /data/local/tmp; \ ./speed_benchmark_torch \ --model=mobilenetv2.pk \ --input_dims="1,3,224,224" \ --input_type=float \ --warmup=5 \ --iter=20 \ --print_output=false \ --caffe2_threadpool_force_inline=true' Milliseconds per iter: 367.055. adb shell 'cd /data/local/tmp; \ ./speed_benchmark_torch_eigen \ --model=mobilenetv2.pk \ --input_dims="1,3,224,224" \ --input_type=float \ --warmup=5 \ --iter=20 \ --print_output=false \ --caffe2_threadpool_force_inline=true' Milliseconds per iter: 348.77. ``` Differential Revision: D17489587 fbshipit-source-id: efe542db810a900f680da7ec7e60f215f58db66e	2019-09-20 15:45:11 -07:00
Jiakai Liu	6fcbc37753	improve how pytorch_android cmake imports static lib (#26525 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26525 Create a util function to avoid boilerplate code as we are adding more libraries. Test Plan: - build CI; Differential Revision: D17495394 Pulled By: ljk53 fbshipit-source-id: 9e19f96ede4867bdff5157424fa68b71e6cff8bf	2019-09-20 15:45:06 -07:00
Jiakai Liu	9f4174c496	expose USE_STATIC_DISPATCH macro to public headers Summary: USE_STATIC_DISPATCH needs to be exposed as we don't hide header files containing it for iOS (yet). Otherwise it's error-prone to request all external projects to set the macro correctly on their own. Also remove redundant USE_STATIC_DISPATCH definition from other places. Test Plan: - build android gradle to confirm linker can still strip out dead code; - integrate with demo app to confirm inference can run without problem; Differential Revision: D17484260 Pulled By: ljk53 fbshipit-source-id: 653f597acb2583761b723eff8026d77518007533	2019-09-20 14:01:49 -07:00
Ashkan Aliabadi	dc851ab5d4	Integrate forked QNNPACK into mobile PyTorch builds. (#25844 ) Summary: Enable forked QNNPACK builds in PyTorch mobile. Pull Request resolved: https://github.com/pytorch/pytorch/pull/25844 Differential Revision: D17336458 Pulled By: AshkanAliabadi fbshipit-source-id: 6ea09dd6c114b64313e9159bf7f17253bc87bfdb	2019-09-16 20:50:43 -07:00
Jiakai Liu	ffee507d36	change gradle build to use static libtorch + gc-sections (#25984 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25984 Link static libtorch libraries into pytorch.so (API library for android) with "-Wl,--gc-sections" flag to remove unused symbols in libtorch. Test Plan: - full gradle CI with stacked PR; - will check final artifacts.tgz size change; Differential Revision: D17312859 Pulled By: ljk53 fbshipit-source-id: 99584d15922867a7b3c3d661ba238a6f99f43db5	2019-09-12 15:12:45 -07:00
Ivan Kobzarev	d62bca9792	jni-java wrapper for pytorchScript api (#25084 ) Summary: TLDR; initial commit of android java-jni wrapper of pytorchscript c++ api The main idea is to provide java interface for android developers to use pytorchscript modules. java API tries to repeat semantic of c++ and python pytorchscript API org.pytorch.Module (wrapper of torch::jit::script::Module) - static Module load(String path) - IValue forward(IValue... inputs) - IValue runMethod(String methodName, IValue... inputs) org.pytorch.Tensor (semantic of at::Tensor) - newFloatTensor(long[] dims, float[] data) - newFloatTensor(long[] dims, FloatBuffer data) - newIntTensor(long[] dims, int[] data) - newIntTensor(long[] dims, IntBuffer data) - newByteTensor(long[] dims, byte[] data) - newByteTensor(long[] dims, ByteBuffer data) org.pytorch.IValue (semantic of at::IValue) - static factory methods to create pytorchscript supported types Examples of usage api could be found in PytorchInstrumentedTests.java: Module module = Module.load(path); IValue input = IValue.tensor(Tensor.newByteTensor(new long[]{1}, Tensor.allocateByteBuffer(1))); IValue output = module.forward(input); Tensor outputTensor = output.getTensor(); ThreadSafety: Api is not thread safe, all synchronization must be done on caller side. Mutability: org.pytorch.Tensor buffer is DirectBuffer with native byte order, can be created with static factory methods specifing DirectBuffer. At the moment org.pytorch.Tensor does not hold at::Tensor on jni side, it has: long[] dimensions, type, DirectByteBuffer blobData Input tensors are mutable (can be modified and used for the next inference), Uses values from buffer on the momment of Module#forward or Module#runMethod calls. Buffers of input tensors is used directly by input at::Tensor Output is copied from output at::Tensor and is immutable. Dependencies: Jni level is implemented with usage of fbjni library, that was developed in Facebook, and was already used and opensourced in several opensource projects, added to the repo as submodule from personal account to be able to switch submodule when fbjni will be opensourced separately. ghstack-source-id: b39c848359a70d717f2830a15265e4aa122279c0 Pull Request resolved: https://github.com/pytorch/pytorch/pull/25084 Pull Request resolved: https://github.com/pytorch/pytorch/pull/25105 Reviewed By: dreiss Differential Revision: D16988107 Pulled By: IvanKobzarev fbshipit-source-id: 41ca7c9869f8370b8504c2ef8a96047cc16516d4	2019-08-23 10:42:44 -07:00

48 Commits