pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Nikita Shulga	0910429d72	[BE][CMake] Use FindPython module (#124613 ) FindPythonInterp and FindPythonLibs have been deprecated since cmake-3.12. Replace `PYTHON_EXECUTABLE` with `Python_EXECUTABLE` everywhere (CMake variable names are case-sensitive). This makes PyTorch buildable with the python3 binary shipped with Xcode on macOS. TODO: Get rid of `FindNumpy`, as it's part of the Python package. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124613 Approved by: https://github.com/cyyever, https://github.com/Skylion007	2024-05-29 13:17:35 +00:00
cyy	83845a7c78	[1/2] Remove caffe2 db and distributed from build system (#125092 ) This PR tries to decompose https://github.com/pytorch/pytorch/pull/122527 into a smaller one. Caffe2 db, distributed and some binaries have been removed. To be noted, this was inspired and is co-dev with @r-barnes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125092 Approved by: https://github.com/malfet	2024-05-04 06:48:46 +00:00
Remi Domingues	fdbbd20f32	Cache conda and pip for IOS CI (#91359 ) Fixes T137630520 Caching for conda and pip dependencies for iOS CI workflow. - Conda and pip dependencies have been moved from [_ios-build-test.yml](https://github.com/pytorch/pytorch/blob/master/.github/workflows/_ios-build-test.yml) to dedicated requirements files - Miniconda shell installation has been replaced by `setup-miniconda@main` which supports caching Pull Request resolved: https://github.com/pytorch/pytorch/pull/91359 Approved by: https://github.com/malfet, https://github.com/huydhn	2022-12-30 17:52:20 +00:00
John Detloff	e0229d6517	Remove caffe2 mobile (#84338 ) We're no longer building Caffe2 mobile as part of our CI, and it adds a lot of clutter to our make files. Any lingering internal dependencies will use the buck build and so won't be affected. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84338 Approved by: https://github.com/dreiss	2022-09-08 01:49:55 +00:00
Jing Xu	3c7044728b	Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289 ) More detailed description of benefits can be found at #41001. This is Intel's counterpart of NVidia’s NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx). ITT is a functionality for labeling trace data during application execution across different Intel tools. For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both standalone VTune Profiler [(https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html](https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html)) and Kineto-integrated VTune functionality in the future. It works for both Intel CPU and Intel XPU devices. Pitch Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer customized code scopes on CPU, like NVTX for NVidia GPU. This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch. Usage example: ``` with torch.autograd.profiler.emit_itt(): for i in range(10): torch.itt.range_push('step_{}'.format(i)) model(input) torch.itt.range_pop() ``` cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289 Approved by: https://github.com/malfet	2022-07-13 13:50:15 +00:00
PyTorch MergeBot	1454515253	Revert "Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289 )" This reverts commit `f988aa2b3f`. Reverted https://github.com/pytorch/pytorch/pull/63289 on behalf of https://github.com/malfet due to broke trunk, see `f988aa2b3f`	2022-06-30 12:49:41 +00:00
Jing Xu	f988aa2b3f	Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289 ) More detailed description of benefits can be found at #41001. This is Intel's counterpart of NVidia’s NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx). ITT is a functionality for labeling trace data during application execution across different Intel tools. For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both standalone VTune Profiler [(https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html](https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html)) and Kineto-integrated VTune functionality in the future. It works for both Intel CPU and Intel XPU devices. Pitch Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer customized code scopes on CPU, like NVTX for NVidia GPU. This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch. Usage example: ``` with torch.autograd.profiler.emit_itt(): for i in range(10): torch.itt.range_push('step_{}'.format(i)) model(input) torch.itt.range_pop() ``` cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289 Approved by: https://github.com/malfet	2022-06-30 05:14:03 +00:00
Mengwei Liu	9ce9803abe	[PyTorch] Add codegen unboxing ability (#69881 ) Summary: RFC: https://github.com/pytorch/rfcs/pull/40 This PR (re)introduces python codegen for unboxing wrappers. Given an entry of `native_functions.yaml` the codegen should be able to generate the corresponding C++ code to convert ivalues from the stack to their proper types. To trigger the codegen, run ``` tools/jit/gen_unboxing.py -d cg/torch/share/ATen ``` Merged changes on CI test. In https://github.com/pytorch/pytorch/issues/71782 I added an e2e test for static dispatch + codegen unboxing. The test exports a mobile model of mobilenetv2, load and run it on a new binary for lite interpreter: `test/mobile/custom_build/lite_predictor.cpp`. ## Lite predictor build specifics 1. Codegen: `gen.py` generates `RegisterCPU.cpp` and `RegisterSchema.cpp`. Now with this PR, once `static_dispatch` mode is enabled, `gen.py` will not generate `TORCH_LIBRARY` API calls in those cpp files, hence avoids interaction with the dispatcher. Once `USE_LIGHTWEIGHT_DISPATCH` is turned on, `cmake/Codegen.cmake` calls `gen_unboxing.py` which generates `UnboxingFunctions.h`, `UnboxingFunctions_[0-4].cpp` and `RegisterCodegenUnboxedKernels_[0-4].cpp`. 2. Build: `USE_LIGHTWEIGHT_DISPATCH` adds generated sources into `all_cpu_cpp` in `aten/src/ATen/CMakeLists.txt`. All other files remain unchanged. In reality all the `Operators_[0-4].cpp` are not necessary but we can rely on linker to strip them off. ## Current CI job test coverage update Created a new CI job `linux-xenial-py3-clang5-mobile-lightweight-dispatch-build` that enables the following build options: * `USE_LIGHTWEIGHT_DISPATCH=1` * `BUILD_LITE_INTERPRETER=1` * `STATIC_DISPATCH_BACKEND=CPU` This job triggers `test/mobile/lightweight_dispatch/build.sh` and builds `libtorch`. Then the script runs C++ tests written in `test_lightweight_dispatch.cpp` and `test_codegen_unboxing.cpp`. 
Recent commits added tests to cover as many C++ argument types as possible: in `build.sh` we install the PyTorch Python API so that we can export test models in `tests_setup.py`. Then we run the C++ test binary to run these models on the lightweight-dispatch-enabled runtime. Pull Request resolved: https://github.com/pytorch/pytorch/pull/69881 Reviewed By: iseeyuan Differential Revision: D33692299 Pulled By: larryliu0820 fbshipit-source-id: 211e59f2364100703359b4a3d2ab48ca5155a023 (cherry picked from commit 58e1c9a25e3d1b5b656282cf3ac2f548d98d530b)	2022-03-01 23:28:13 +00:00
Eli Uriegas	0a8b391936	ci: Enable tests for iOS on GHA These were left out of the initial migration for some reason, so this just transfers over those tests. Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/71644 Signed-off-by: Eli Uriegas <eliuriegas@fb.com>	2022-01-27 19:32:12 +00:00
Tao Xu	f5daa9f76b	[iOS] Enable ARC for CMake build (#67884 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67884 Test Plan: Imported from OSS Reviewed By: husthyc Differential Revision: D32191532 Pulled By: xta0 fbshipit-source-id: a295004f8e7f1b0f5a4ab12ffd9b37c36b80226b	2021-11-04 16:50:46 -07:00
Eli Uriegas	9e97ccbd7a	.github: Migrate iOS workflows to GHA (#67645 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67645 Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: malfet Differential Revision: D32104367 Pulled By: seemethere fbshipit-source-id: 08ff043ed5d0b434322f1f3f20dce2a4f5fa88c1	2021-11-02 14:38:43 -07:00
Chen Lai	355acfdebc	[PyTorch Edge][tracing-based] use operator.yaml to build libtorch library (#66237 ) Summary: https://pxl.cl/1QK3N Enable using the yaml file from tracer to build libtorch library for ios and android. 1. Android: ``` SELECTED_OP_LIST=/Users/chenlai/Documents/pytorch/tracing/deeplabv3_scripted_tracing_update.yaml TRACING_BASED=1 ./scripts/build_pytorch_android.sh x86 ``` libtorch_lite.so x86: 3 MB (larger than H1, static is ~3.2 MB) 2. iOS ``` SELECTED_OP_LIST=/Users/chenlai/Documents/pytorch/tracing/deeplabv3_scripted_tracing_update.yaml TRACING_BASED=1 BUILD_PYTORCH_MOBILE=1 IOS_PLATFORM=SIMULATOR ./scripts/build_ios.sh ``` Binary size: 7.6 MB Size: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66237 ghstack-source-id: 140197164 Reviewed By: dhruvbird Differential Revision: D31463119 fbshipit-source-id: c3f4eb71bdef1969eab6cb60999fec8547641cbd	2021-10-10 14:07:01 -07:00
Tao Xu	18fa58c4e9	[CoreML][OSS] Integrate with CMake (#64523 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64523 - Build PyTorch with CoreML delegate - `USE_PYTORCH_METAL=ON python setup.py install --cmake` - Build iOS static libs - `IOS_PLATFORM=SIMULATOR USE_COREML_DELEGATE=1 ./scripts/build_ios.sh` ghstack-source-id: 138324216 Test Plan: - Test the HelloWorld example {F657778559} Reviewed By: iseeyuan Differential Revision: D30594041 fbshipit-source-id: 8cece0b2d4b3ef82d3ef4da8c1054919148beb16	2021-09-17 10:32:00 -07:00
Kimish Patel	38c185189c	[Pytorch Edge] Enable kineto profiler on mobile via EdgeKinetoProfiler (#62419 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62419 This diff adds support for the CPU-only kineto profiler on mobile, thus enabling chrome trace generation on mobile. This brings the C++ API for mobile profiling on par with TorchScript. This is done via: 1. Utilizing debug handle annotations in KinetoEvent. 2. Adding post-processing capability, via callbacks, to KinetoThreadLocalState. 3. Creating a new RAII-style profiler, KinetoEdgeCPUProfiler, which can be used in the surrounding scope of model execution. This will write the chrome trace to the location specified in the profiler constructor. Test Plan: MobileProfiler.ModuleHierarchy Imported from OSS Reviewed By: raziel Differential Revision: D29993660 fbshipit-source-id: 0b44f52f9e9c5f5aff81ebbd9273c254c3c03299	2021-08-13 21:40:19 -07:00
Chen Lai	b5a834a739	[Pytorch] Build lite interpreter as default for iOS Summary: Two changes: 1. Build lite interpreter as default for iOS 2. Switch the previous lite interpreter test to full jit build test Test Plan: Imported from OSS Differential Revision: D27698039 Reviewed By: xta0 Pulled By: cccclai fbshipit-source-id: 022b554f4997ae577681f2b79a9ebe9236ca4f7d	2021-05-17 22:36:05 -07:00
davidriazati@fb.com	4b96fc060b	Remove distutils (#57040 ) Summary: [distutils](https://docs.python.org/3/library/distutils.html) is on its way out and will be deprecated-on-import for Python 3.10+ and removed in Python 3.12 (see [PEP 632](https://www.python.org/dev/peps/pep-0632/)). There's no reason for us to keep it around since all the functionality we want from it can be found in `setuptools` / `sysconfig`. `setuptools` includes a copy of most of `distutils` (which is fine to use according to the PEP), that it uses under the hood, so this PR also uses that in some places. Fixes #56527 Pull Request resolved: https://github.com/pytorch/pytorch/pull/57040 Pulled By: driazati Reviewed By: nikithamalgifb Differential Revision: D28051356 fbshipit-source-id: 1ca312219032540e755593e50da0c9e23c62d720	2021-04-29 12:10:11 -07:00
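As a rough illustration of the distutils removal above (a sketch only; the PR's actual diff is not shown in this log, and the helper names below are hypothetical), the kind of lookup that build scripts used to do through `distutils.sysconfig` can be done with the standard-library `sysconfig` module:

```python
import sysconfig


def python_include_dir() -> str:
    """Directory that holds Python.h for the running interpreter.

    Hypothetical helper: it stands in for the kind of query that
    distutils.sysconfig.get_python_inc() used to answer.
    """
    return sysconfig.get_paths()["include"]


def python_version_tag() -> str:
    """MAJOR.MINOR version string, e.g. '3.11', taken from sysconfig."""
    return sysconfig.get_python_version()


print("include dir:", python_include_dir())
print("version tag:", python_version_tag())
```

Both calls are part of the standard library, so they keep working on Python 3.12 and later, where `distutils` is no longer available.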
Chen Lai	14f7bf0629	[PyTorch] update CMake to build libtorch lite (#51419 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51419 ## Summary 1. Add an option `BUILD_LITE_INTERPRETER` in `caffe2/CMakeLists.txt` and set `OFF` as default. 2. Update 'build_android.sh' with an argument to switch `BUILD_LITE_INTERPRETER`, 'OFF' as default. 3. Add a mini demo app `lite_interpreter_demo` linked with the `libtorch` library, which can be used for a quick test. ## Test Plan Built the lite interpreter version of libtorch and tested it with the Image Segmentation demo app ([android version](https://github.com/pytorch/android-demo-app/tree/master/ImageSegmentation)/[ios version](https://github.com/pytorch/ios-demo-app/tree/master/ImageSegmentation)) ### Android 1. Prepare model: Prepare the lite interpreter version of the model by running the script below to generate the scripted models `deeplabv3_scripted.pt` and `deeplabv3_scripted.ptl` ``` import torch model = torch.hub.load('pytorch/vision:v0.7.0', 'deeplabv3_resnet50', pretrained=True) model.eval() scripted_module = torch.jit.script(model) # Export full jit version model (not compatible with lite interpreter), leave it here for comparison scripted_module.save("deeplabv3_scripted.pt") # Export lite interpreter version model (compatible with lite interpreter) scripted_module._save_for_lite_interpreter("deeplabv3_scripted.ptl") ``` 2. Build libtorch lite for android: Build libtorch for android for all 4 android abis (armeabi-v7a, arm64-v8a, x86, x86_64) `BUILD_LITE_INTERPRETER=1 ./scripts/build_pytorch_android.sh`. This PR is tested on a Pixel 4 emulator with x86, so use the command `BUILD_LITE_INTERPRETER=1 ./scripts/build_pytorch_android.sh x86` to specify the abi and save build time. After the build finishes, it will show the library path: ``` ... 
BUILD SUCCESSFUL in 55s 134 actionable tasks: 22 executed, 112 up-to-date + find /Users/chenlai/pytorch/android -type f -name 'aar' + xargs ls -lah -rw-r--r-- 1 chenlai staff 13M Feb 11 11:48 /Users/chenlai/pytorch/android/pytorch_android/build/outputs/aar/pytorch_android-release.aar -rw-r--r-- 1 chenlai staff 36K Feb 9 16:45 /Users/chenlai/pytorch/android/pytorch_android_torchvision/build/outputs/aar/pytorch_android_torchvision-release.aar ``` 3. Use the PyTorch Android libraries built from source in the ImageSegmentation app: Create a folder 'libs' in the path, the path from repository root will be `ImageSegmentation/app/libs`. Copy `pytorch_android-release` to the path `ImageSegmentation/app/libs/pytorch_android-release.aar`. Copy 'pytorch_android_torchvision` (downloaded from [here](https://oss.sonatype.org/#nexus-search;quick~torchvision_android)) to the path `ImageSegmentation/app/libs/pytorch_android_torchvision.aar` Update the `dependencies` part of `ImageSegmentation/app/build.gradle` to ``` dependencies { implementation 'androidx.appcompat:appcompat:1.2.0' implementation 'androidx.constraintlayout:constraintlayout:2.0.2' testImplementation 'junit:junit:4.12' androidTestImplementation 'androidx.test.ext:junit:1.1.2' androidTestImplementation 'androidx.test.espresso:espresso-core:3.3.0' implementation(name:'pytorch_android-release', ext:'aar') implementation(name:'pytorch_android_torchvision', ext:'aar') implementation 'com.android.support:appcompat-v7:28.0.0' implementation 'com.facebook.fbjni:fbjni-java-only:0.0.3' } ``` Update `allprojects` part in `ImageSegmentation/build.gradle` to ``` allprojects { repositories { google() jcenter() flatDir { dirs 'libs' } } } ``` 4. 
Update model loader api: Update `ImageSegmentation/app/src/main/java/org/pytorch/imagesegmentation/MainActivity.java` by 4.1 Add new import: `import org.pytorch.LiteModuleLoader;` 4.2 Replace the way to load pytorch lite model ``` // mModule = Module.load(MainActivity.assetFilePath(getApplicationContext(), "deeplabv3_scripted.pt")); mModule = LiteModuleLoader.load(MainActivity.assetFilePath(getApplicationContext(), "deeplabv3_scripted.ptl")); ``` 5. Test app: Build and run the ImageSegmentation app in Android Studio, ![image](https://user-images.githubusercontent.com/16430979/107696279-9cea5900-6c66-11eb-8286-4d1d68abff61.png) ### iOS 1. Prepare model: Same as Android. 2. Build libtorch lite for ios* `BUILD_PYTORCH_MOBILE=1 IOS_PLATFORM=SIMULATOR BUILD_LITE_INTERPRETER=1 ./scripts/build_ios.sh` 3. Remove Cocoapods from the project: run `pod deintegrate` 4. Link ImageSegmentation demo app with the custom built library: Open your project in XCode, go to your project Target’s Build Phases - Link Binaries With Libraries, click the + sign and add all the library files located in `build_ios/install/lib`. Navigate to the project Build Settings, set the value Header Search Paths to `build_ios/install/include` and Library Search Paths to `build_ios/install/lib`. In the build settings, search for other linker flags. Add a custom linker flag below ``` -all_load ``` Finally, disable bitcode for your target by selecting the Build Settings, searching for Enable Bitcode, and set the value to No. 5. 
Update library and api 5.1 Update `TorchModule.mm` To use the custom built libraries in the project, replace `#import <LibTorch/LibTorch.h>` (in `TorchModule.mm`), which is needed when using LibTorch via Cocoapods, with the code below: ``` //#import <LibTorch/LibTorch.h> #include "ATen/ATen.h" #include "caffe2/core/timer.h" #include "caffe2/utils/string_utils.h" #include "torch/csrc/autograd/grad_mode.h" #include "torch/script.h" #include <torch/csrc/jit/mobile/function.h> #include <torch/csrc/jit/mobile/import.h> #include <torch/csrc/jit/mobile/interpreter.h> #include <torch/csrc/jit/mobile/module.h> #include <torch/csrc/jit/mobile/observer.h> ``` 5.2 Update `ViewController.swift` ``` // if let filePath = Bundle.main.path(forResource: // "deeplabv3_scripted", ofType: "pt"), // let module = TorchModule(fileAtPath: filePath) { // return module // } else { // fatalError("Can't find the model file!") // } if let filePath = Bundle.main.path(forResource: "deeplabv3_scripted", ofType: "ptl"), let module = TorchModule(fileAtPath: filePath) { return module } else { fatalError("Can't find the model file!") } ``` ### Unit test Add `test/cpp/lite_interpreter`, with one unit test `test_cores.cpp` and a light model `sequence.ptl` to test `_load_for_mobile()`, `bc.find_method()` and `bc.forward()` functions. 
### Size: With the change: Android: x86: `pytorch_android-release.aar` (13.8 MB) IOS: `pytorch/build_ios/install/lib` (lib: 66 MB): ``` (base) chenlai@chenlai-mp lib % ls -lh total 135016 -rw-r--r-- 1 chenlai staff 3.3M Feb 15 20:45 libXNNPACK.a -rw-r--r-- 1 chenlai staff 965K Feb 15 20:45 libc10.a -rw-r--r-- 1 chenlai staff 4.6K Feb 15 20:45 libclog.a -rw-r--r-- 1 chenlai staff 42K Feb 15 20:45 libcpuinfo.a -rw-r--r-- 1 chenlai staff 39K Feb 15 20:45 libcpuinfo_internals.a -rw-r--r-- 1 chenlai staff 1.5M Feb 15 20:45 libeigen_blas.a -rw-r--r-- 1 chenlai staff 148K Feb 15 20:45 libfmt.a -rw-r--r-- 1 chenlai staff 44K Feb 15 20:45 libpthreadpool.a -rw-r--r-- 1 chenlai staff 166K Feb 15 20:45 libpytorch_qnnpack.a -rw-r--r-- 1 chenlai staff 384B Feb 15 21:19 libtorch.a -rw-r--r-- 1 chenlai staff 60M Feb 15 20:47 libtorch_cpu.a ``` `pytorch/build_ios/install`: ``` (base) chenlai@chenlai-mp install % du -sh * 14M include 66M lib 2.8M share ``` Master (baseline): Android: x86: `pytorch_android-release.aar` (16.2 MB) IOS: `pytorch/build_ios/install/lib` (lib: 84 MB): ``` (base) chenlai@chenlai-mp lib % ls -lh total 172032 -rw-r--r-- 1 chenlai staff 3.3M Feb 17 22:18 libXNNPACK.a -rw-r--r-- 1 chenlai staff 969K Feb 17 22:18 libc10.a -rw-r--r-- 1 chenlai staff 4.6K Feb 17 22:18 libclog.a -rw-r--r-- 1 chenlai staff 42K Feb 17 22:18 libcpuinfo.a -rw-r--r-- 1 chenlai staff 1.5M Feb 17 22:18 libeigen_blas.a -rw-r--r-- 1 chenlai staff 44K Feb 17 22:18 libpthreadpool.a -rw-r--r-- 1 chenlai staff 166K Feb 17 22:18 libpytorch_qnnpack.a -rw-r--r-- 1 chenlai staff 384B Feb 17 22:19 libtorch.a -rw-r--r-- 1 chenlai staff 78M Feb 17 22:19 libtorch_cpu.a ``` `pytorch/build_ios/install`: ``` (base) chenlai@chenlai-mp install % du -sh * 14M include 84M lib 2.8M share ``` Test Plan: Imported from OSS Reviewed By: iseeyuan Differential Revision: D26518778 Pulled By: cccclai fbshipit-source-id: 4503ffa1f150ecc309ed39fb0549e8bd046a3f9c	2021-02-21 01:43:54 -08:00
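The size figures quoted in the commit above can be turned into relative savings with a little arithmetic. The sketch below only reuses the numbers reported in the message (16.2 MB vs 13.8 MB for the x86 `.aar`, 84 MB vs 66 MB for `build_ios/install/lib`, 78M vs 60M for `libtorch_cpu.a`); the helper name is illustrative, not part of any PyTorch tooling.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative size reduction, in percent, going from `before` to `after`."""
    return (before - after) / before * 100.0


# (baseline, with lite interpreter), taken from the commit message above.
reported_sizes_mb = {
    "Android x86 pytorch_android-release.aar": (16.2, 13.8),
    "iOS build_ios/install/lib": (84.0, 66.0),
    "iOS libtorch_cpu.a": (78.0, 60.0),
}

for name, (before, after) in reported_sizes_mb.items():
    saved = percent_reduction(before, after)
    print(f"{name}: {before} MB -> {after} MB ({saved:.1f}% smaller)")
```

By this calculation the lite interpreter build is roughly 15% smaller for the Android archive and a little over 21% smaller for the iOS library directory.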
Tao Xu	bf1ea14fbc	[CI][IOS] Add a arm64 ios job for Metal (#46646 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46646 Test Plan: Imported from OSS Reviewed By: seemethere, linbinyu Differential Revision: D24459597 Pulled By: xta0 fbshipit-source-id: e93a3a26897614c66768804c71658928cd26ede7	2020-10-22 16:54:46 -07:00
Tao Xu	04e5fcc0ed	[GPU] Introduce USE_PYTORCH_METAL (#46383 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46383 The old `USE_METAL` is actually being used by Caffe2. Here we introduce a new macro to enable metal in pytorch. ghstack-source-id: 114499392 Test Plan: - Circle CI - The Person Segmentation model works Reviewed By: linbinyu Differential Revision: D24322018 fbshipit-source-id: 4e5548afba426b49f314366d89b18ba0c7e745ca	2020-10-16 18:19:32 -07:00
Tao Xu	a277c097ac	[iOS][GPU] Add Metal/MPSCNN support on iOS (#46112 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46112 ### Summary This PR adds the support of running torchscript models on iOS GPU via Metal (Inference only). The feature is currently in prototype state, API changes are expected. The tutorial and the documents will be added once it goes to beta. allow-large-files - Users API ``` auto module = torch::jit::load(model); module.eval(); at::Tensor input = at::ones({1,3,224,224}, at::ScalarType::Float).metal(); auto output = module.forward({input}).toTensor().cpu(); ``` - Supported Models - Person Segmentation v106 (FB Internal) - Mobilenetv2 - Supported Operators - aten::conv2d - aten::addmm - aten::add.Tensor - aten::sub.Tensor - aten::mul.Tensor - aten::relu - aten::hardtanh - aten::hardtanh_ - aten::sigmoid - aten::max_pool2d - aten::adaptive_avg_pool2d - aten::reshape - aten::t - aten::view - aten::log_softmax.int - aten::upsample_nearest2d.vec - Supported Devices - Apple A9 and above - iOS 10.2 and above - CMake scripts - `IOS_ARCH=arm64 ./scripts/build_ios.sh -DUSE_METAL=ON` ### Test Plan - Circle CI ghstack-source-id: 114155638 Test Plan: 1. Sandcastle CI 2. Circle CI Reviewed By: dreiss Differential Revision: D23236555 fbshipit-source-id: 98ffc48b837e308bc678c37a9a5fd8ae72d11625	2020-10-13 01:46:56 -07:00
Jiakai Liu	3a0e35c9f2	[pytorch] deprecate static dispatch (#43564 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43564 Static dispatch was originally introduced for mobile selective build. Since we have added selective build support for dynamic dispatch and tested it in FB production for months, we can deprecate static dispatch to reduce the complexity of the codebase. Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D23324452 Pulled By: ljk53 fbshipit-source-id: d2970257616a8c6337f90249076fca1ae93090c7	2020-08-27 14:52:48 -07:00
Tao Xu	6de6041585	[iOS] Disable NNPACK on iOS builds (#39868 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39868 ### Summary Why disable NNPACK on iOS: - To stay consistent with our internal version - It's currently blocking some external users due to its lack of support for the x86 architecture - https://github.com/pytorch/pytorch/issues/32040 - https://discuss.pytorch.org/t/undefined-symbols-for-architecture-x86-64-for-libtorch-in-swift-unit-test/84552/6 - NNPACK uses fast convolution algorithms (FFT, Winograd) to reduce the computational complexity of convolutions with large kernel sizes. The algorithmic speedup is limited to specific conv params which are unlikely to appear in mobile networks. - Since XNNPACK has been enabled, it performs much better than NNPACK on depthwise-separable convolutions, which is the algorithm used by most mobile computer vision networks. ### Test Plan - CI Checks Test Plan: Imported from OSS Differential Revision: D22087365 Pulled By: xta0 fbshipit-source-id: 89a959b0736c1f8703eff10723a8fbd02357fd4a	2020-06-17 01:39:56 -07:00
Gemfield	70f3298684	Fix SELECTED_OP_LIST file path issue (#33942 ) Summary: If SELECTED_OP_LIST is specified as a relative path in command line, CMake build will fail. Pull Request resolved: https://github.com/pytorch/pytorch/pull/33942 Differential Revision: D20392797 Pulled By: ljk53 fbshipit-source-id: dffeebc48050970e286cf263bdde8b26d8fe4bce	2020-03-11 13:19:31 -07:00
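The failure mode described in the commit above (a relative `SELECTED_OP_LIST` breaking the CMake build) usually comes from the path being interpreted against a different working directory than the one the user typed it in. A minimal sketch of the general remedy, resolving the path against the caller's directory before handing it on, might look like this; the function name and the example directory are hypothetical, and this is not the code from the PR:

```python
import os


def resolve_op_list(path: str, caller_cwd: str) -> str:
    """Return an absolute path for an op-list file.

    `path` may be relative to `caller_cwd` (the directory the build
    script was invoked from) or already absolute.
    """
    expanded = os.path.expanduser(path)
    # os.path.join keeps `expanded` untouched when it is already absolute.
    return os.path.normpath(os.path.join(caller_cwd, expanded))


# Hypothetical invocation directory, for illustration only.
print(resolve_op_list("mobilenetv2.yaml", "/home/user/pytorch"))
print(resolve_op_list("/tmp/ops.yaml", "/home/user/pytorch"))
```

Resolving once, at the point where the variable is read, means every later consumer sees the same absolute path regardless of its own working directory.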
Jiakai Liu	9a5e9d8cec	[pytorch][mobile] change mobile build scripts to build PyTorch by default (#34203 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34203 Currently cmake and mobile build scripts still build libcaffe2 by default. To build pytorch mobile users have to set environment variable BUILD_PYTORCH_MOBILE=1 or set cmake option BUILD_CAFFE2_MOBILE=OFF. PyTorch mobile has been released for a while. It's about time to change CMake and build scripts to build libtorch by default. Changed caffe2 CI job to build libcaffe2 by setting BUILD_CAFFE2_MOBILE=1 environment variable. Only found android CI for libcaffe2 - do we ever have iOS CI for libcaffe2? Test Plan: Imported from OSS Differential Revision: D20267274 Pulled By: ljk53 fbshipit-source-id: 9d997032a599c874d62fbcfc4f5d4fbf8323a12e	2020-03-05 23:40:47 -08:00
Tao Xu	9c0625b004	[iOS] Add watchOS support (#33318 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33318 ### Summary Recently, we had a [discussion](https://discuss.pytorch.org/t/libtorch-on-watchos/69073/14) in the forum about watchOS. This PR adds support for building watchOS libraries. ### Test Plan - `BUILD_PYTORCH_MOBILE=1 IOS_PLATFORM=WATCHOS ./scripts/build_ios.sh` Test Plan: Imported from OSS Differential Revision: D19896534 Pulled By: xta0 fbshipit-source-id: 7b9286475e895d9fefd998246e7090ac92c4c9b6	2020-02-14 14:02:22 -08:00
Jiakai Liu	43fb0015db	custom build script (#30144 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30144 Create a script to produce a libtorch that only contains the ops needed by specific models. Developers can use this workflow to further optimize mobile build size. We need to keep a dummy stub for unused (stripped) ops because some JIT-side logic requires certain function schemas to exist in the JIT op registry. Test Steps: 1. Build the "dump_operator_names" binary and use it to dump the root ops needed by a specific model: ``` build/bin/dump_operator_names --model=mobilenetv2.pk --output=mobilenetv2.yaml ``` 2. The MobileNetV2 model should use the following ops: ``` - aten::t - aten::dropout - aten::mean.dim - aten::add.Tensor - prim::ListConstruct - aten::addmm - aten::_convolution - aten::batch_norm - aten::hardtanh_ - aten::mm ``` NOTE that for some reason it outputs "aten::addmm" but actually uses "aten::mm". You need to fix it manually for now. 3. Run the custom build script locally (using Android as an example): ``` SELECTED_OP_LIST=mobilenetv2.yaml scripts/build_pytorch_android.sh armeabi-v7a ``` 4. Check out the demo app that uses the locally built library instead of downloading from the jcenter repo: ``` git clone --single-branch --branch custom_build git@github.com:ljk53/android-demo-app.git ``` 5. Copy the locally built libraries to the demo app folder: ``` find ${HOME}/src/pytorch/android -name '*.aar' -exec cp {} ${HOME}/src/android-demo-app/HelloWorldApp/app/libs/ \; ``` 6. Build the demo app with the locally built libtorch: ``` cd ${HOME}/src/android-demo-app/HelloWorldApp ./gradlew clean && ./gradlew assembleDebug ``` 7. Install and run the demo app. In-APK arm-v7 libpytorch_jni.so build size reduced from 5.5M to 2.9M. Test Plan: Imported from OSS Differential Revision: D18612127 Pulled By: ljk53 fbshipit-source-id: fa8d5e1d3259143c7346abd1c862773be8c7e29a	2019-11-20 13:16:02 -08:00
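The manual fix mentioned in step 2 of the commit above (adding `aten::mm` when the dump only reports `aten::addmm`) can be sketched as a small post-processing step over the dumped list. This assumes the dumped file is a flat YAML list with one `- op::name` entry per line, as shown in the message; the function names are hypothetical and a real script would more likely use a YAML library.

```python
from typing import Iterable, List


def parse_op_list(text: str) -> List[str]:
    """Parse a flat YAML list ('- aten::t' per line) into operator names."""
    ops = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("- "):
            ops.append(line[2:].strip())
    return ops


def add_missing_ops(ops: Iterable[str], extra: Iterable[str]) -> List[str]:
    """Append ops from `extra` that are not already listed, keeping order."""
    result = list(ops)
    seen = set(result)
    for op in extra:
        if op not in seen:
            result.append(op)
            seen.add(op)
    return result


dumped = "- aten::t\n- aten::addmm\n- aten::_convolution\n"
fixed = add_missing_ops(parse_op_list(dumped), ["aten::mm"])
print("\n".join(f"- {op}" for op in fixed))
```

Keeping `aten::addmm` and appending `aten::mm` matches the op list shown in the message, which contains both entries.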
Jiakai Liu	9371b31818	set USE_STATIC_DISPATCH outside cmake (#29715 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29715 Previously we hard-coded it to enable static dispatch when building the mobile library. Since we are exploring approaches to deprecate static dispatch, we should make it optional. This PR moves the setting from cmake to the bash build scripts, where it can be overridden. Test Plan: - verified it's still using static dispatch when building with these scripts. Differential Revision: D18474640 Pulled By: ljk53 fbshipit-source-id: 7591acc22009bfba36302e3b2a330b1428d8e3f1	2019-11-14 20:41:29 -08:00
Ashkan Aliabadi	1345dabb1d	Only set CCACHE_WRAPPER_PATH in the build scripts if it is not already passed in. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29002 Test Plan: Imported from OSS Differential Revision: D18277225 Pulled By: AshkanAliabadi fbshipit-source-id: eb70607790754cd5d214133967404242c05dd5d5	2019-11-01 18:39:12 -07:00
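The "set only if not already passed in" behaviour described in the commit above is a common pattern for build-script variables. The build scripts themselves are shell, so the following Python version is only an analogy of the idea, and the default path used here is a made-up placeholder rather than the value from the scripts:

```python
from typing import Dict, Mapping


def with_default_ccache_path(env: Mapping[str, str], default_path: str) -> Dict[str, str]:
    """Return a copy of `env` where CCACHE_WRAPPER_PATH is set only if absent.

    A value supplied by the caller always wins over `default_path`.
    """
    result = dict(env)
    result.setdefault("CCACHE_WRAPPER_PATH", default_path)
    return result


# Placeholder default, for illustration only.
DEFAULT = "/opt/ccache/lib"

print(with_default_ccache_path({}, DEFAULT)["CCACHE_WRAPPER_PATH"])
print(with_default_ccache_path({"CCACHE_WRAPPER_PATH": "/custom"}, DEFAULT)["CCACHE_WRAPPER_PATH"])
```

Note that `setdefault` treats an empty string as "already set"; a shell script has to choose between `${VAR-default}` and `${VAR:-default}` to make the same decision.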
Tao Xu	5f2c320840	Disable bitcode for iOS CI jobs (#26478 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26478 ### Summary Since QNNPACK [doesn't support bitcode](`7d2a4e9931/scripts/build-ios-arm64.sh (L40)`), I'm going to disable it in our CMake scripts. This won't hurt any existing functionalities, and will only affect the build size. Any application that wants to integrate our framework should turn off bitcode as well. ### Test plan - CI job works - LibTorch.a can be compiled and run on iOS devices Test Plan: Imported from OSS Differential Revision: D17489020 Pulled By: xta0 fbshipit-source-id: 950619b9317036cad0505d8a531fb8f5331dc81f	2019-09-19 15:38:57 -07:00
Jiakai Liu	16c1907830	update build_android.sh to not build host protoc for libtorch (#25896 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25896 Similar change as PR #25822. Test Plan: - Updated CI to use the new script. - Will check pytorch android CI output to make sure it builds libtorch instead of libcaffe2. Reviewed By: dreiss Differential Revision: D17279722 Pulled By: ljk53 fbshipit-source-id: 93abcef0dfb93df197fabff29e53d71db5674255	2019-09-10 15:19:43 -07:00
Tao Xu	001ba1c504	Clean up the iOS build script (#25822 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25822 ### Summary Since protobuf has been removed from mobile, the `build_host_protoc.sh` can be removed from `build_ios.sh` as well. However, the old caffe2 mobile build still depends on it; therefore, I introduced this `BUILD_PYTORCH_MOBILE` flag to gate the build. - iOS device build ``` BUILD_PYTORCH_MOBILE=1 IOS_ARCH=arm64 ./scripts/build_ios.sh BUILD_PYTORCH_MOBILE=1 IOS_ARCH=armv7s ./scripts/build_ios.sh ``` - iOS simulator build ``` BUILD_PYTORCH_MOBILE=1 IOS_PLATFORM=SIMULATOR ./scripts/build_ios.sh ``` ### Test Plan All device and simulator builds run successfully Test Plan: Imported from OSS Differential Revision: D17264469 Pulled By: xta0 fbshipit-source-id: f8994bbefec31b74044eaf01214ae6df797816c3	2019-09-09 11:59:50 -07:00
Tao Xu	514285890c	Enable QNNPACK for iOS (#24030 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24030 The cmake arg `USE_QNNPACK` was disabled for the iOS build due to its lack of support for building multiple archs (armv7;armv7s;arm64) simultaneously. To enable it, we need to specify the value of IOS_ARCH explicitly in the build command: ``` ./scripts/build_ios.sh \ -DIOS_ARCH=arm64 \ -DBUILD_CAFFE2_MOBILE=OFF \ ``` However, iOS.cmake will overwrite this value according to the value of `IOS_PLATFORM`. This PR is a fix to this problem. Test Plan: - `USE_QNNPACK` should be turned on by cmake. - `libqnnpack.a` can be generated successfully. - `libtorch.a` can be compiled and run successfully on iOS devices. <img src="https://github.com/xta0/AICamera-ObjC/blob/master/aicamera.gif?raw=true" width="400"> Differential Revision: D16771014 Pulled By: xta0 fbshipit-source-id: 4cdfd502cb2bcd29611e4c22e2efdcdfe9c920d3	2019-08-13 21:10:59 -07:00
Tao Xu	4c6c9ffaf8	Move iOS.cmake to the cmake folder (#24029 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24029 The cmake toolchain file for building iOS is currently in `/third_party/ios-cmake`. Since the upstream is not active anymore, it's better to maintain this file ourselves moving forward. This PR is also the prerequisite for enabling QNNPACK for iOS. Test Plan: - The `libtorch.a` can be generated successfully - The `libtorch.a` can be compiled and run on iOS devices <img src="https://github.com/xta0/AICamera-ObjC/blob/master/aicamera.gif?raw=true" width="400"> Differential Revision: D16770980 Pulled By: xta0 fbshipit-source-id: 1ed7b12b3699bac52b74183fa7583180bb17567e	2019-08-12 14:17:28 -07:00
Karl Ostmo	8f0603b128	C++ changes toward libtorch and libcaffe2 unification (#19554 ) Summary: * adds TORCH_API and AT_CUDA_API in places * refactor code generation Python logic to separate caffe2/torch outputs * fix hip and asan * remove profiler_cuda from hip * fix gcc warnings for enums * Fix PythonOp::Kind Pull Request resolved: https://github.com/pytorch/pytorch/pull/19554 Differential Revision: D15082727 Pulled By: kostmo fbshipit-source-id: 83a8a99717f025ab44b29608848928d76b3147a4	2019-04-26 01:38:10 -07:00
Gemfield	1c3428af31	Enhance build_ios.sh to be consistent with build_android.sh (#18564 ) Summary: 1, Enhance build_ios.sh to be consistent with build_android.sh; 2, Add docs for build_ios.sh. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18564 Differential Revision: D14680752 Pulled By: soumith fbshipit-source-id: 6d2667ed8a3c85a057a522838f5d0461dd4788cf	2019-03-28 21:37:55 -07:00
Marat Dukhan	351478439f	Disable QNNPACK for multi-architecture iOS builds (#14125 ) Summary: QNNPACK contains assembly files, and CMake tries to build them for wrong architectures in multi-arch builds. This patch has two effects: - Disables QNNPACK in multi-arch iOS builds - Specifies a single `IOS_ARCH=arm64` by default (covers most iPhones/iPads on the market) Pull Request resolved: https://github.com/pytorch/pytorch/pull/14125 Differential Revision: D13112366 Pulled By: Maratyszcza fbshipit-source-id: b369083045b440e41d506667a92e41139c11a971	2018-11-16 21:18:01 -08:00
Yangqing Jia	73ed0d5ced	Modernizing the gflags dependency in cmake. Summary: Historically, for interface dependent libraries (glog, gflags and protobuf), exposing them in Caffe2Config.cmake is usually difficult. New versions of glog and gflags ship with new-style cmake targets, so one does not need to use variables. New-style targets also make it easier for people to depend on them in installed config files. This diff modernizes the gflags library, and still provides a fallback path if the installed gflags does not have cmake config files coming with it. It does change one behavior of the build process though - when one specifies -DUSE_GFLAGS=ON but gflags cannot be found, the old script automatically turns it off but the new script crashes, forcing the user to specify USE_GFLAGS=OFF. Closes https://github.com/caffe2/caffe2/pull/1819 Differential Revision: D6826604 Pulled By: Yangqing fbshipit-source-id: 210f3926f291c8bfeb24eb9671e5adfcbf8cf7fe	2018-01-27 19:31:14 -08:00
Junjie Bai	303ed8af44	Allow specifying cmake build directory in the build scripts Summary: Closes https://github.com/caffe2/caffe2/pull/1496 Reviewed By: pietern Differential Revision: D6379743 Pulled By: bddppq fbshipit-source-id: 1cb2238e5708547767729de3ac1d3e1a76ed5ba1	2017-11-20 20:32:30 -08:00
Pieter Noordhuis	39f0859749	Use ccache for macOS builds if present Summary: Closes https://github.com/caffe2/caffe2/pull/1475 Reviewed By: Yangqing Differential Revision: D6340034 Pulled By: pietern fbshipit-source-id: a932b8b2fd6f94215162b1f15f8f3ea640f542be	2017-11-15 14:38:36 -08:00
Pieter Noordhuis	9575364d30	Update protobuf detection Summary: The scripts/build_local.sh script would always build protoc from the third_party protobuf tree and override the PROTOBUF_PROTOC_EXECUTABLE CMake variable. This variable is used by the protobuf CMake files, so it doesn't let us detect whether the protoc was specified by the user or by the protobuf CMake files (e.g. an existing installation). This in turn led to a problem where system installed headers would be picked up while using protoc built from third_party. This only works if the system installed version matches the version included in the Caffe2 tree. Therefore, this commit changes the variable to specify a custom protoc executable to CAFFE2_CUSTOM_PROTOC_EXECUTABLE, and forces the use of the bundled libprotobuf when it is specified. The result is that we now EITHER specify a custom protoc (as required for cross-compilation where protoc must be compiled for the host and libprotobuf for the target architecture) and use libprotobuf from the Caffe2 tree, OR use system protobuf. If system protobuf cannot be found, we fall back to building protoc and libprotobuf in tree and packaging it as part of the Caffe2 build artifacts. Closes https://github.com/caffe2/caffe2/pull/1328 Differential Revision: D6032836 Pulled By: pietern fbshipit-source-id: b75f8dd88412f02c947dc81ca43f7b2788da51e5	2017-10-12 11:48:50 -07:00
Yangqing Jia	93e12e75df	Allow caffe2 to detect if cuda lib has been linked, and also fix oss build error. Summary: Closes https://github.com/caffe2/caffe2/pull/1114 Reviewed By: pietern Differential Revision: D5686557 Pulled By: Yangqing fbshipit-source-id: 6b7245ebbe4eeb025ce9d0fe8fda427a0c3d9770	2017-08-23 18:41:15 -07:00
Luke Yeager	fda35fd19d	TravisCI Overhaul Summary: Uncached build: https://travis-ci.org/lukeyeager/caffe2/builds/239677224 Cached build: https://travis-ci.org/lukeyeager/caffe2/builds/239686725 * Parallel builds everywhere * All builds use CCache for quick build times (help from https://github.com/pytorch/pytorch/pull/614, https://github.com/ccache/ccache/pull/145) * Run ctests when available (continuation of https://github.com/caffe2/caffe2/pull/550) * Upgraded from cuDNN v5 to v6 * Fixed MKL build (by updating pkg version) * Fixed android builds (`b6f905a67b (commitcomment-22404119)`) * ~~Building NNPACK fails with no discernible error message (currently disabled entirely)~~ * ~~Android builds continue to fail with existing error:~~ * ~~OSX builds time-out:~~ \| Before \| After \| Changes \| \| --- \| --- \| --- \| \| COMPILER=g++ \| linux \| without CUDA \| \| COMPILER=g++-5 \| linux-gcc5 \| without CUDA \| \| COMPILER=g++ \| linux-cuda \| updated to cuDNN v6 \| \| BLAS=MKL \| linux-mkl \| updated pkg version \| \| BUILD_TARGET=android \| linux-android \| \| \| COMPILER=clang++ \| osx \| \| \| BUILD_TARGET=ios \| osx-ios \| \| \| BUILD_TARGET=android \| osx-android \| \| \| QUICKTEST \| GONE \| \| \| COMPILER=g++-4.8 \| GONE \| \| \| COMPILER=g++-4.9 \| GONE \| \| Closes https://github.com/caffe2/caffe2/pull/735 Reviewed By: Yangqing Differential Revision: D5228966 Pulled By: bwasti fbshipit-source-id: 6cfa6f5ff05fbd5c2078beea79564f1f3b9812fe	2017-06-16 10:18:05 -07:00
Anatoly Rosencrantz	1040b5f91c	Enable bitcode for iOS builds Summary: build_ios.sh now has `-fembed-bitcode` flags for cmake and passes these flags to build_host_protoc.sh (which now accepts an optional argument `--other-flags`). That allows the output libs (libCaffe2_CPU.a, libCAFFE2_NNPACK.a, libCAFFE2_PTHREADPOOL.a and libprotobuf-lite.a, libprotobuf.a respectively) to be used in Xcode projects with bitcode enabled. Bitcode is enabled by default in all projects since Xcode7, is crucial for slicing and is mandatory for watchOS targets. Enabling bitcode for a target requires bitcode to be enabled for all of its dependencies as well, so Caffe2 built without bitcode forces developers to switch off bitcode for the whole app. Closes https://github.com/caffe2/caffe2/pull/457 Reviewed By: bwasti Differential Revision: D4978644 Pulled By: Yangqing fbshipit-source-id: 5165abb507fb91bc8c38f7348d6836bccf8fcc22	2017-05-01 10:32:11 -07:00
Nay Oo	884690adb3	build_ios.sh comments fixes Summary: Changed _Android_ to _iOS_ in the comments in scripts/build_ios.sh. Closes https://github.com/caffe2/caffe2/pull/364 Differential Revision: D4930101 Pulled By: Yangqing fbshipit-source-id: 8f0a6aa1b43fd57c2f71f1c667c61d1f69b1e061	2017-04-21 10:52:29 -07:00
Yangqing Jia	9f86de2dc7	Support WatchOS build Summary: To build, run `IOS_PLATFORM=WATCHOS scripts/build_ios.sh` Closes https://github.com/caffe2/caffe2/pull/321 Reviewed By: Yangqing Differential Revision: D4923400 Pulled By: salexspb fbshipit-source-id: 3a87f068562a01e972ea915c9be32f0667e8ea19	2017-04-20 18:15:47 -07:00
Yangqing Jia	22f3825d8f	Cmake mobile build improvements Summary: (1) integrate gcc compatible nnpack (2) speed up the ios travis ci. Closes https://github.com/caffe2/caffe2/pull/268 Differential Revision: D4897576 Pulled By: Yangqing fbshipit-source-id: 729fa2e4b5be6f1d0b8d55305f047116969ff61f	2017-04-16 16:46:58 -07:00
Yangqing Jia	3a82b33f84	Use protobuf's own cmake scripts and add travis for ios Summary: Closes https://github.com/caffe2/caffe2/pull/110 Differential Revision: D4475170 Pulled By: Yangqing fbshipit-source-id: 5964db04186619ac563f516cb202c5e2ba543403	2017-01-28 13:29:32 -08:00

47 Commits