Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65971
ghstack-source-id: 141842335
We should be able to load methods into their ClassTypes. Right now the mobile runtime only loads data members into ClassTypes, not methods. To support interface calls, we inject methods into their ClassTypes as the methods are loaded.
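As a rough, self-contained illustration of the idea (toy types only, not the actual mobile runtime API): methods get registered on their owning class type at load time, so an interface call can be resolved by method name against the runtime type of the receiver.
```cpp
#include <functional>
#include <iostream>
#include <string>
#include <unordered_map>
#include <utility>

// Toy stand-ins for ClassType / Function; the real runtime uses
// c10::ClassType and mobile::Function, whose exact APIs differ.
struct ToyFunction {
  std::string name;
  std::function<int(int)> body;
};

struct ToyClassType {
  std::string qualified_name;
  std::unordered_map<std::string, ToyFunction> methods;

  // What this diff enables conceptually: attach methods to the class type
  // as they are loaded, instead of only attaching data members.
  void addMethod(ToyFunction fn) {
    methods.emplace(fn.name, std::move(fn));
  }

  // An interface call only knows the method name; it resolves the callee
  // against the runtime class type of the receiver object.
  int callMethod(const std::string& name, int arg) const {
    return methods.at(name).body(arg);
  }
};

int main() {
  ToyClassType impl{"__torch__.MyImpl", {}};
  impl.addMethod({"forward", [](int x) { return x + 1; }});
  // Interface-style dispatch: look up "forward" on the receiver's type.
  std::cout << impl.callMethod("forward", 41) << "\n";  // prints 42
}
```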
Test Plan: existing tests should all pass.
Reviewed By: qihqi
Differential Revision: D31326146
fbshipit-source-id: fb1dbea619910ef1f8fa26146da3ebab348fe902
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64066
I noticed a bunch of time being spent heap-allocating Tuples
in the unpickler. 1-, 2-, and 3-element Tuples are apparently common
enough that they get their own bytecode instructions, so I decided to
try also giving them their own representation. We store up to 3
IValues inline in `Tuple` rather than doing a second heap allocation
for a `std::vector<IValue>`.
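As a rough sketch of the idea (not the actual `TupleElements` implementation), a small-buffer-optimized container can keep up to three elements in an inline array and fall back to a heap-allocated vector only for larger tuples:
```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy small-buffer container: up to 3 elements live inline; larger tuples
// fall back to a heap-allocated std::vector. The real TupleElements stores
// IValues; int keeps the sketch short.
class SmallTupleElements {
 public:
  explicit SmallTupleElements(std::vector<int> elems) {
    if (elems.size() <= 3) {
      inline_size_ = elems.size();
      for (size_t i = 0; i < inline_size_; ++i) inline_[i] = elems[i];
    } else {
      inline_size_ = kUseVector;
      heap_ = std::move(elems);
    }
  }
  size_t size() const {
    return inline_size_ == kUseVector ? heap_.size() : inline_size_;
  }
  int& operator[](size_t i) {
    return inline_size_ == kUseVector ? heap_[i] : inline_[i];
  }

 private:
  static constexpr size_t kUseVector = static_cast<size_t>(-1);
  size_t inline_size_;     // element count, or kUseVector when heap_ is used
  int inline_[3] = {};     // inline storage for 1-, 2-, and 3-element tuples
  std::vector<int> heap_;  // fallback for larger tuples
};

int main() {
  SmallTupleElements t({10, 20});        // no heap allocation for the elements
  assert(t.size() == 2 && t[1] == 20);
  SmallTupleElements big({1, 2, 3, 4});  // falls back to the vector
  assert(big.size() == 4);
}
```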
ghstack-source-id: 140695395
Test Plan:
Added automated tests for TupleElements.
Pixel 3 before: https://www.internalfb.com/intern/aibench/details/761596366576284
Pixel 3 after: https://www.internalfb.com/intern/aibench/details/591414145082422
We went from 347 ms to 302 ms.
Reviewed By: dhruvbird
Differential Revision: D30592622
fbshipit-source-id: 93625c54c9dca5f765ef6d5c191944179cb281a8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65865
`operator_str` is not used in `import.cpp`, and it is also defined in `parse_operators.cpp`, so remove it from `import.cpp`.
Test Plan: CI passing
Reviewed By: iseeyuan
Differential Revision: D31293008
fbshipit-source-id: 1c857cbd63c57b8f79c1a068789fc8605605b642
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64065
It is only safe to mutate Tuple elements if you are the sole owner
of the tuple. The most efficient way to do this, then, is
`std::move(*std::move(tupleIValue).toTuple()).elements()` (the
innermost move allows `IValue::toTuple()` to avoid a refcount bump and
the outermost move allows the element vector to be moved out of the
tuple), but many callsites write simply
`tupleIValue.toTuple().elements()`, which incurs many extra refcount
bumps.
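A minimal example of the two patterns, using the expression quoted above (assuming a libtorch build; element types are left as `auto` since the exact return type depends on the Tuple implementation):
```cpp
#include <ATen/core/ivalue.h>

#include <utility>

using c10::IValue;

int main() {
  // A 2-element tuple wrapped in an IValue, as the unpickler would produce.
  IValue tupleIValue(c10::ivalue::Tuple::create(IValue(1), IValue(2)));

  // Refcount-heavy pattern: toTuple() on an lvalue bumps the refcount, and
  // elements() is then copied out of a tuple we don't solely own.
  auto copied = tupleIValue.toTuple()->elements();

  // Cheaper pattern from the summary: the inner move lets toTuple() steal the
  // intrusive_ptr without a refcount bump, and the outer move lets the element
  // storage be moved out of the (now solely owned) tuple.
  auto moved = std::move(*std::move(tupleIValue).toTuple()).elements();
  (void)copied;
  (void)moved;
}
```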
ghstack-source-id: 139468088
Test Plan: CI
Reviewed By: ezyang
Differential Revision: D30592621
fbshipit-source-id: e8312de866de09b9ea2a62e5128cbf403ee16f09
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65710
No need to incur extra refcount bumps, and no need to use a stringstream for what are presumably string keys anyway.
ghstack-source-id: 139325445
Test Plan: CI, reviewers to confirm the keys are supposed to be strings
Reviewed By: dhruvbird
Differential Revision: D31215347
fbshipit-source-id: 82be93cb2e57aefe94edf74d149115cb734112be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65179
This is following up this PR: https://github.com/pytorch/pytorch/pull/61862. The purpose is to modularize operator parsing so that it can be used as needed without pulling the whole `import.cpp` into build.
Test Plan: Added a unit test in `test_lite_predictor.cpp` called `ParseOperators`, similar to `ParseBytecode`.
Reviewed By: iseeyuan
Differential Revision: D31006555
fbshipit-source-id: c38e221800af4cf72963a353c452c5437f56a0ac
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64269
Revert the changes in D29826210 (693d8f2f07): we no longer need operator lambda caching since there aren't duplicate operators anymore.
This diff stack results in an additional approx. 12% speedup in model loading time (from 229 ms to 200 ms) when run against an 87 MB speech model that jiatongzhou provided.
ghstack-source-id: 138014904
Test Plan:
**Speech Transducer v25 model (as in D29826210 (693d8f2f07))**
| | Before | After |
|---|---|---|
| Load Time | [229ms](https://www.internalfb.com/intern/aibench/details/160889436133243) | [200ms](https://www.internalfb.com/intern/aibench/details/837884532607514) |
| Save File Size | [86.23 MB](https://lookaside.facebook.com/intern/diff/file/data/?number=658544950) | [86.1 MB](https://lookaside.facebook.com/intern/diff/file/data/?number=658554403) |
The "after" flamegraph shows significantly less time is spent on ```append_operator``` than before.
Steps
- Check out desired commit in devserver (base branch or this diff)
- ```buck build bento/kernels:bento_kernel_pytorch```
- Use N1094068 with pytorch_local kernel to save model for lite interpreter
- Edit ```aibench/specifications/models/pytorch/speech_transducer/v25.json ``` to have new model location and md5
- ```buck run aibench:run_bench -- -b aibench/specifications/models/pytorch/speech_transducer/v25.json --framework pytorch --platform android/arm64 --devices "S8US" --force_profile --remote ```
**Test that saving a model with de-dup ops doesn't change its output**
https://www.internalfb.com/intern/anp/view/?id=1137434
Reviewed By: iseeyuan
Differential Revision: D30615710
fbshipit-source-id: bb4052f0f16eccab386585e94411056f94bce43c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61862
Modularize functions of parsing bytecode tables so that they can be used as needed in situations other than mobile lite interpreter.
* The decoupled functions are re-used by current lite interpreter loader.
* The bytecode can be serialized/deserialized from other formats.
* The decoupled functions have minimum dependencies on other PyTorch components.
Next:
Build a driver binary that includes the parser and interpreter, with only the necessary dependencies on other PyTorch components.
ghstack-source-id: 137867287
Test Plan:
As an example, a simple bytecode is parsed to a mobile function, and directly run in the added unit test, `RunTimeTest:ParseBytecode`. It contains basic control flow (if, else) and basic data orchestration (list construction).
CI
Reviewed By: larryliu0820
Differential Revision: D29798382
Pulled By: iseeyuan
fbshipit-source-id: 1c173a5f5d37097e3a97baec3f3e48e1eea1400f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64029
This should save us a separate pass over the data structure to destroy it.
ghstack-source-id: 137566821
Test Plan:
Pixel3
before:
https://www.internalfb.com/intern/aibench/details/503337445067962
after:
https://our.intern.facebook.com/intern/aibench/details/320277034999340
Overall mean time decreased from 373 ms to 358 ms. In the flame graph, we
can see that some of the time spent destroying a vector of IValues moved
into parseMethods, and the new parseMethods time is less than the old
time plus the recursive destruction time.
Reviewed By: dhruvbird
Differential Revision: D30559530
fbshipit-source-id: d080295a846745ea03ac50f08f4f6c95f4eaf3d8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64307
Original commit changeset: 0b2aa7c57d08
Restores original changes.
This diff changes the way operator profiling is done in the lite predictor
benchmarking binary.
Instead of using custom callbacks, it uses KinetoEdgeCPUProfiler to profile
events and then generates operator-level metrics from them.
Since KinetoEvents do not contain CPU clock time, we now report only wall-clock
time.
This unifies the various profiling efforts we have for benchmarking purposes. In
production we will still use the observer-based mechanism, but the advantage of
using the Kineto profiler is that we get a few other things for free, such as:
- chrome trace generation.
- operator-level memory profiling (to be added)
- flop counts (to be added)
Furthermore, we can possibly use a Python post-processing script to parse the chrome
trace and generate output similar to torch.profiler. (To be done)
Furthermore, this diff removes some tests from test_lite_interpreter.cpp which were testing module hierarchy in debug info. They should be covered by test_mobile_profiler.cpp.
Test Plan:
aibench run
Model without debug info:
https://www.internalfb.com/intern/aibench/details/219598441154763
Model with debug info and --print_module_info true (note that the operator summary now has module hierarchy information).
https://www.internalfb.com/intern/aibench/details/617154236292985
Reviewed By: raziel
Differential Revision: D30680354
fbshipit-source-id: b6ba0d59c510c13d13d9935b1d8051cc82ffa4e9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63367
This diff changes the way operator profiling is done in the lite predictor
benchmarking binary.
Instead of using custom callbacks, it uses KinetoEdgeCPUProfiler to profile
events and then generates operator-level metrics from them.
Since KinetoEvents do not contain CPU clock time, we now report only wall-clock
time.
This unifies the various profiling efforts we have for benchmarking purposes. In
production we will still use the observer-based mechanism, but the advantage of
using the Kineto profiler is that we get a few other things for free, such as:
- chrome trace generation.
- operator-level memory profiling (to be added)
- flop counts (to be added)
Furthermore, we can possibly use a Python post-processing script to parse the chrome
trace and generate output similar to torch.profiler. (To be done)
Test Plan:
aibench run
Model without debug info:
https://www.internalfb.com/intern/aibench/details/219598441154763
Model with debug info and `--print_module_info true` (note that the operator summary now has module hierarchy information).
https://www.internalfb.com/intern/aibench/details/617154236292985
Reviewed By: raziel
Differential Revision: D30327514
fbshipit-source-id: 3bb2f2daaaedfb04bd6f5d9c91292783f9c4344f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64010
Fixing typos in an error message.
Test Plan:
Error message before fix:
Lite Interpreter verson number does not match. The model version must be between 3 and 5But the model version is 6
Error message after fix:
Lite Interpreter version number does not match. The model version must be between 3 and 5 but the model version is 6
Reviewed By: larryliu0820
Differential Revision: D30568367
fbshipit-source-id: 205f3278ee8dcf38579dbb828580a9e986ccacc1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61996
A recent post https://fb.workplace.com/groups/pytorch.edge.users/posts/2012215235600341/ about slow model loading, with an accompanying perf report (report.html), caused me to look at the report and find hot spots during model loading. This suggested that we spend quite a bit of time looking up operators from the dispatcher. This means that we can probably just cache the operator handler functions (instead of computing them every time the operator name shows up, since it potentially shows up multiple times in a given model).
This diff results in an approx 7% speedup in model loading time (from [315ms](https://www.internalfb.com/intern/aibench/details/45077128343028) to [293ms](https://www.internalfb.com/intern/aibench/details/600870874797229)) when run against an 87MB speech model that jiatongzhou provided.
See https://fb.workplace.com/groups/pytorch.dev/posts/855724575006024/ for the previous post from jiatongzhou.
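A hedged sketch of the caching idea (not the exact code in this diff): operator handles looked up from the dispatcher can be memoized per qualified name, so repeated occurrences of the same operator in the bytecode reuse the first lookup.
```cpp
#include <ATen/core/dispatch/Dispatcher.h>

#include <string>
#include <unordered_map>

// Illustrative memoization of dispatcher lookups keyed by "name.overload".
// Not thread-safe; the real loader caches the appended operator functions,
// this sketch only shows the lookup-caching idea.
c10::optional<c10::OperatorHandle> findOpCached(
    const std::string& name,
    const std::string& overload_name) {
  static std::unordered_map<std::string, c10::optional<c10::OperatorHandle>>
      cache;
  const std::string key = name + "." + overload_name;
  auto it = cache.find(key);
  if (it == cache.end()) {
    it = cache
             .emplace(
                 key,
                 c10::Dispatcher::singleton().findSchema({name, overload_name}))
             .first;
  }
  return it->second;
}
```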
ghstack-source-id: 134634612
Test Plan:
Run using AI Bench.
### Speech Transducer v25 model (87MiB)
Followed up with jiatongzhou and he gave me his speech model. For posterity, here's how to fetch it (you don't need to, since I uploaded it to NMLML and it now has a permanent Everstore Handle):
```
cd /tmp/
mkdir speech_model
cd speech_model
fbpkg fetch speech.stella.neural_transducer.on_device.en_us:25
cp pytorchmodel.pt ~/speech_transducer_v25_pytorchmodel.ptl
```
Here's how to build and run the benchmark using AI Bench:
```
buck run aibench:run_bench -- -b aibench/specifications/models/pytorch/speech_transducer/v25.json --framework pytorch --platform android/arm64 --devices "S8US" --force_profile --remote
```
Reviewed By: raziel
Differential Revision: D29826210
fbshipit-source-id: 134b67eb466e73f0e43447b9b966278f13c4b56f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61731
In the previous diff, metadata_map contains mobile_info.json and producer_info.json, and we need to parse the JSON each time we log the required information. This diff flattens the content of those files into key/value pairs, which allows the logger to loop directly over metadata_map and log the information.
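A minimal sketch of the flattening step, assuming a generic JSON library (nlohmann/json here purely for illustration; the helper name and choice of library are not from this diff):
```cpp
#include <nlohmann/json.hpp>  // assumption: any JSON library with a similar API works

#include <string>
#include <unordered_map>

// Hypothetical helper: flatten one level of a JSON object such as
// mobile_info.json into string key/value pairs that a logger can iterate
// over directly, without re-parsing JSON at log time.
std::unordered_map<std::string, std::string> flattenJson(
    const std::string& json_text) {
  std::unordered_map<std::string, std::string> flat;
  const auto parsed = nlohmann::json::parse(json_text);
  for (const auto& item : parsed.items()) {
    const auto& value = item.value();
    // Store scalars as their string form; nested values keep their JSON dump.
    flat[item.key()] =
        value.is_string() ? value.get<std::string>() : value.dump();
  }
  return flat;
}
```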
Test Plan:
Since 3D Photo is disabled in the current FB app, testing is only performed on the CC scanner.
# Test On CC Scanner
**Test content with LOG(WARNING)**
{P429123273}
**Scuba Logger Output**
1. MOBILE_MODULE_LOAD_STATS
{F631884673}
2. MOBILE_MODULE_STATS
{F631884787}
Reviewed By: xcheng16
Differential Revision: D29690702
fbshipit-source-id: 1db5a1f5c25e98e5b2f1cc254fd880dfdfa025e2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61600
This diff changes the module.h constructor and removes metadata_. It refactors all the constructor call sites and creates a getter & setter for metadata_. MOBILE_MODULE_STATS reads the metadata from mobile::Module and passes it into the logger.
Test Plan:
Since 3D Photo is disabled in the current FB app, testing is only performed on the CC scanner.
# Test On CC Scanner
**Test content with LOG(WARNING)**
{P428930572}
**Scuba Logger Output**
{F631761194}
Reviewed By: xcheng16
Differential Revision: D29673184
fbshipit-source-id: 962e0d7b06a07caaa0c695a4ac58b885fd1505ea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61480
Append mobile_info.json and producer_info.json to extra_files and parse the JSON from “model_info.json” in onExitLoadModel.
ghstack-source-id: 133327912
Test Plan:
# Test On CC Scanner
**Test content with LOG(WARNING)**
{P428339274}
**Scuba Logger Output**
{F631024095}
# Test On 3D Photo
**Test content with LOG(WARNING)**
{P428340927}
**Scuba Logger Output**
{F631026739}
Reviewed By: xcheng16, guangy10
Differential Revision: D29608014
fbshipit-source-id: abc39c44b947632fd4349de8a432649e84284a87
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59472
Previously, the lite interpreter would refuse to load any model
with a version greater than kProducedBytecodeVersion. Now, we're
able to independently advance the loading and saving code, so we
can roll out changes without breaking forward compatibility.
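A minimal sketch of the resulting check (constant names, values, and message are illustrative; the real loader's constants may differ): loading is gated on a supported range rather than on kProducedBytecodeVersion.
```cpp
#include <c10/util/Exception.h>

#include <cstdint>

// Illustrative constants: the range this runtime can load is allowed to run
// ahead of the version the saving code currently produces.
constexpr int64_t kMinSupportedBytecodeVersion = 3;
constexpr int64_t kMaxSupportedBytecodeVersion = 5;
constexpr int64_t kProducedBytecodeVersion = 4;  // saving can lag loading

void checkLoadable(int64_t model_version) {
  // The loader checks the supported range, not kProducedBytecodeVersion,
  // so it can accept newer models than the writer currently emits.
  TORCH_CHECK(
      model_version >= kMinSupportedBytecodeVersion &&
          model_version <= kMaxSupportedBytecodeVersion,
      "Lite Interpreter version number does not match. The model version must be between ",
      kMinSupportedBytecodeVersion,
      " and ",
      kMaxSupportedBytecodeVersion,
      " but the model version is ",
      model_version);
}
```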
Test Plan:
CI.
Loaded a bytecode v5 model even with setting kProducedBytecodeVersion
to v4.
Reviewed By: raziel
Differential Revision: D28904350
fbshipit-source-id: 598c22f0adf47d4ed3e976bcbebdf3959dacb1df
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59354
Check whether the model has `bytecode.pkl` and provide a proper error message before loading the model. Test it by loading a model.pt and a model.ptl.
```
>>> from torch.jit.mobile import _load_for_lite_interpreter
>>> _load_for_lite_interpreter("/Users/chenlai/Documents/pytorch/data/mobilenet_v2.pt")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/chenlai/pytorch/torch/jit/mobile/__init__.py", line 48, in _load_for_lite_interpreter
cpp_module = torch._C._load_for_lite_interpreter(f, map_location) # type: ignore[attr-defined]
RuntimeError: The model is not generated from the api _save_for_lite_interpreter. Please regenerate the module by scripted_module._save_for_lite_interpreter('model.ptl'). Refer to https://pytorch.org/tutorials/prototype/lite_interpreter.html for more details.
```
Differential Revision: D28856713
Test Plan: Imported from OSS
Reviewed By: dhruvbird
Pulled By: cccclai
fbshipit-source-id: c3f9a3b64459dda6811d296371c8a2eaf22f8b20
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58182
As title, the v5 model format will be
```
(base) chenlai@chenlai-mp reuse_constant % zipinfo /Users/chenlai/Documents/pytorch/reuse_constant/tmp/zip/script_module_v5_unify.ptl
Archive: /Users/chenlai/Documents/pytorch/reuse_constant/tmp/zip/script_module_v5_unify.ptl
Zip file size: 3120 bytes, number of entries: 7
-rw---- 0.0 fat 77 bl stor 80-000-00 00:00 script_module_v4_unify/data.pkl
-rw---- 0.0 fat 240 bl defN 80-000-00 00:00 script_module_v4_unify/code/__torch__/___torch_mangle_5.py
-rw---- 0.0 fat 422 bl defN 80-000-00 00:00 script_module_v4_unify/code/__torch__/___torch_mangle_5.py.debug_pkl
-rw---- 0.0 fat 64 bl stor 80-000-00 00:00 script_module_v4_unify/constants/140245072983168.storage
-rw---- 0.0 fat 172 bl stor 80-000-00 00:00 script_module_v4_unify/constants.pkl
-rw---- 0.0 fat 678 bl stor 80-000-00 00:00 script_module_v4_unify/bytecode.pkl
-rw---- 0.0 fat 2 bl stor 80-000-00 00:00 script_module_v4_unify/version
7 files, 1655 bytes uncompressed, 1453 bytes compressed: 12.2%
```
bytecode.pkl is:
```
(5,
('__torch__.___torch_mangle_5.TestModule.forward',
(('instructions',
(('STOREN', 1, 2),
('DROPR', 1, 0),
('LOADC', 0, 0),
('LOADC', 1, 0),
('MOVE', 2, 0),
('OP', 0, 0),
('LOADC', 1, 0),
('OP', 1, 0),
('RET', 0, 0))),
('operators', (('aten::add', 'int'), ('aten::add', 'Scalar'))),
('constants',
(torch._utils._rebuild_tensor_v2(pers.obj(('storage',
torch.DoubleStorage,
'140245072983168.storage',
'cpu',
8),),
0,
(2, 4),
(4, 1),
False,
collections.OrderedDict()),
1)),
('types', ()),
('register_size', 2)),
(('arguments',
((('name', 'self'),
('type', '__torch__.___torch_mangle_5.TestModule'),
('default_value', None)),
(('name', 'y'), ('type', 'int'), ('default_value', None)))),
('returns',
((('name', ''), ('type', 'Tensor'), ('default_value', None)),)))))
```
constants.pkl is:
```
(torch._utils._rebuild_tensor_v2(pers.obj(('storage', torch.DoubleStorage, '140245072983168.storage', 'cpu', 8),),
0,
(2, 4),
(4, 1),
False,
collections.OrderedDict()),)
```
Both tensors will refer to the tensor at the path `script_module_v4_unify/constants/140245072983168.storage`.
## Note
According to the unified format, all tensors should be written to the folder `.data`; however, torch.jit.load() can't handle the unified format at this moment, so this change writes tensors to the `constants` folder, and mobile will write/read tensors from the `constants` folder as well, such that the model can be interpreted by both JIT and mobile.
ghstack-source-id: 129010347
Test Plan: buck test mode/dev //caffe2/test/cpp/jit:jit
Reviewed By: raziel, iseeyuan
Differential Revision: D28375257
fbshipit-source-id: 6544472db4c957c5ea037e0bb5112b637dd15897
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56845
Handle forward/backward compatibility caused by added default arguments in mobile. As an example,
In older version, operator aten::foo's schema is
```
foo(Tensor a, Tensor b) -> Tensor
```
In the new version, the schema is updated to
```
foo(Tensor a, Tensor b, int groups=1) -> Tensor
```
## Model file
Serialize the number of specified arguments for each operator into the bytecode operator table. Before, the operator table contained only the operator name and overload name:
```
('operators', (('aten::foo', ''),))
```
Now the number of specified arguments is added:
```
# bytecode version 6
('operators', (('aten::foo', '', 2),))
```
where "2" means the number of specified arguments.
Since there's bytecode schema change, the bytecode version number is bumped. This PR is to be landed after #56002 , where the version number is bumped from 4 to 5. This PR bumps the version number from 5 to 6.
## Runtime and backward compatibility
When the operator is found (either jit or c10), we have the OperatorHandle, where the operator schema can be accessed by
```
op.value().schema().arguments()
```
Adaptation is implemented to handle backward compatibility. For the example above, the new runtime holds the updated schema:
```
foo(Tensor a, Tensor b, int groups=1) -> Tensor
```
Whereas the model file carries
```
(('aten::foo', ''), 2)
```
We can implement a wrapper around the original function pointer to push the default arguments onto the stack, as sketched below.
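A hedged sketch of such a wrapper (simplified; the real adapter in the mobile loader handles more details than shown here):
```cpp
#include <ATen/core/function_schema.h>
#include <ATen/core/ivalue.h>
#include <ATen/core/stack.h>

#include <cstddef>
#include <functional>
#include <utility>

using OpFunction = std::function<void(torch::jit::Stack&)>;

// Wrap an operator function so that, when the model only specified
// `num_specified_args` arguments, the remaining arguments' default values
// from the (newer) runtime schema are pushed before calling the op.
OpFunction wrapWithDefaults(
    OpFunction fn,
    c10::FunctionSchema schema,
    size_t num_specified_args) {
  return [fn = std::move(fn), schema = std::move(schema), num_specified_args](
             torch::jit::Stack& stack) {
    const auto& args = schema.arguments();
    for (size_t i = num_specified_args; i < args.size(); ++i) {
      // default_value() is an optional<IValue>; a newly added trailing
      // argument is expected to carry one.
      stack.push_back(*args[i].default_value());
    }
    fn(stack);
  };
}
```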
## Delivery time and forward compatibility
At model delivery time, two checks can be done:
### Operator check
Two APIs to be provided:
* Runtime: An API to get a runtime’s ops and their schemas (i.e. the # of args). D27920185(WIP)
* Model: An API to get a model’s ops and their schema requirements (i.e. the # of args required).
The APIs can be used to check
* runtime.ops() is a superset of model.ops()
* for each op in model.ops() validate their schemas are compatible with those in runtime.ops() -- i.e. the # args required in a model op are <= # args in the runtime op.
Note that only root ops in the model need to be checked here; for transient ops it's not necessary. For example, if a root op, "aten::root", calls "aten::foo", it's "aten::root"'s responsibility to adapt to "aten::foo"'s change, or "aten::root" itself needs to be updated too.
### Bytecode version backport (PR coming)
When delivering a model with bytecode v6, if the runtime only works with bytecode v5 and lower, backport is needed.
* The number of arguments is removed from the operator table
* The bytecode version is changed from 6 to 5
Note that this backport is a pure format change; it does not guarantee that the backported model always runs in an old runtime. The operator check mentioned above should be done first, before the model is backported to v5.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D27986544
Pulled By: iseeyuan
fbshipit-source-id: 143e19d4798cfb96b65095538dd648eead4e3fda
Summary:
Add an API, `_get_bytecode_version`, to get the version number of a bytecode model in both C++ and Python; the input can be either a file path or a buffer.
## Test
CI (the newly added unit test will run as part of `pytorch_core-buck`)
1. run test_lite_interpreter.cpp
2. `python test/mobile/test_bytecode.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56801
ghstack-source-id: 128169647
Test Plan:
CI (the newly added unit test will run as part of `pytorch_core-buck`)
1. run test_lite_interpreter.cpp
2. `python test/mobile/test_bytecode.py`
Reviewed By: iseeyuan
Differential Revision: D27961417
fbshipit-source-id: f786cc9573d855feecff0b4fe8e5363e25f5728c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55711
Currently, there is some complex logic that tries to handle all exceptions but re-throws them as a `c10::Error` so that it can log the error message. I'm looking for context on why this was added. The simpler logic in this diff (arrived at after talking with swolchok) seems equivalent, and also preserves the original stack trace from where the exception was originally thrown. This is useful when viewing the backtrace in logview. Re-throwing an exception using `TORCH_CHECK(false, message)` results in the original exception stack trace getting lost, so we want to avoid that.
ghstack-source-id: 128043281
Test Plan: Build.
Reviewed By: iseeyuan
Differential Revision: D27688352
fbshipit-source-id: b7b1a29b652b31da80d72f16d284e48b8623377b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55252
Earlier, for bytecode serialization, we were saving debug handles only for OP instructions and not for all
instructions. This PR makes changes to add debug handles for all instructions.
Test Plan:
python test/mobile/test_lite_script_module.py TestLiteScriptModule
Imported from OSS
Reviewed By: dreiss
Differential Revision: D27542502
fbshipit-source-id: cff75118c721ce9f0c2f60d2c9471481f05264ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55062
This diff introduces the following changes:
1. An InlinedCallStack pickler/serializer is introduced. It is serialized
as a tuple of {module_instance_info, source range tag, callee:InlinedCallStack}.
Module instance info is serialized as a tuple of {class_type_name,
instance_name}.
Note that the callee of the serialized inlined callstack points to the tuple
of an already serialized callstack. This means the first callstack pointer to
be serialized will serialize the entire path of the tree, where some callee
nodes might be shared with callstack pointers that will be serialized
subsequently. The pickler supports memoization of pickled objects, so if
a tuple has already been serialized, its object id is emitted instead of
serializing the object again. Thus we still serialize the tree, and not every
path from the root separately. Furthermore, InlinedCallStackSerializer
also uses a cache to look up the pointer and return the serialized IValue.
Note that we must also serialize the source range of the
InlinedCallStack. In order to do this, the serializer requires a map of
source-range tags to source ranges. This was done in the previous
diff, where as part of source range serialization we also generate
unique tags. These are the tags that are serialized in the InlinedCallStack.
Thus, during deserialization, we have to deserialize source ranges
before deserializing InlinedCallStacks.
2. Furthermore, each serialized InlinedCallStack is serialized with a
unique debug_handle and source range tag.
BackendDebugHandleManager manages generation of
unique debug handles and saves the map of
debug-handles-to-{source_range_tag, inlined-callstack-ptr}.
This map is then serialized as callstack_debug_map.pkl. Note that
the inlined callstack is not sufficient to get all the source information,
since it only contains source information about the nodes which are inlined.
The top-of-the-stack (or bottom) node, which is the actual op node, is
not part of the inlined callstack pointer, and thus the source range of
this node is serialized separately using source_range_tag. This is
similar to how JIT creates callstacks in
torch/csrc/jit/runtime/interpreter.cpp.
Unique debug handles facilitate exception throwing or profiling using
just the debug handle, without any further qualifications such as which
function or module the inlined callstack belongs to.
Furthermore, this diff refactors the old mobile code for tracking
module hierarchy information per op. Now, bytecode serialization
serializes debug handles corresponding to ops/nodes in the graph and has
callstack_debug_map.pkl help generate:
1. Entire callstack and
2. Module hierarchy information.
Test Plan:
python test/mobile/test_lite_script_module.py TestLiteScriptModule
./build/bin/test_jit --gtest_filter=*ModuleInfo
Imported from OSS
Reviewed By: raziel
Differential Revision: D27468709
fbshipit-source-id: 53e2413e7703ead01c77718b7c333c7c6ff50a23
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54284
In order to bring mobile deployment, via the lite interpreter, to feature
parity with JIT with respect to model-level debug information, we must make
model-level debug information available to the mobile runtime.
At the moment, model-level debug information is stored in SourceRange,
which associates nodes of a graph with where they come from in the original
Python source code.
This information is serialized as part of debug_pkl and deserialized
when JIT loads the model and reads the model code.
In the lite interpreter, we do not have access to all the functionality of
JIT, and hence we cannot load the model in the same way as JIT, by reading
code, constructing the module hierarchy and the graphs corresponding to module
methods, etc. Instead, in the lite interpreter, only the bytecode corresponding to
the compiled graph, Code, is saved.
Thus in order to annotate OPs in the bytecode with equivalent
SourceRange information we do the following:
1. During model serialization, we create a unique tag for each source
range of the model.
2. Create a map of <SourceRange, tag>
3. During debug_pkl serialization we save tag along with SourceRange, on
top of byte offset.
4. During bytecode generation, the methods of the top module are
lowered. During this process methods are inlined. In the inlined graph,
when a node of the graph is lowered to bytecode, we query the node's source
range and look it up against the map.
5. Resulting source range tag is serialized in module_debug_info.
6. During model deserialization, we read all the debug_pkl records in
the archive and create a map of <tag, SourceRange>.
7. This map can be used to find source code information.
During mobile runtime:
1. We read all the debug_pkl records and create <tag=debug_handle,
SourceRange> map.
1.1 This map, MobileDebugInfo, is a member of mobile Module.
2. Interpreter catches appropriate exceptions and sets the thread local
debug handle and rethrows the exception.
3. In Function's run method we catch exception and query current debug
handle where the exception happened.
4. Query MobileDebugInfo with debug handle to retrieve source range and
augment error with source range info.
This information is still incomplete as it does not contain the entire
callstack.
In the following diffs we will serialize InlinedCallStack directly.
Note that compilation is gated by the SYMBOLICATE_MOBILE_DEBUG_HANDLE macro,
so that mobile builds can avoid building MobileDebugInfo, source ranges,
and the source range pickler/unpickler. Later we will add a path where, if
building without debug support, the stack trace will contain only debug
handles. They can be symbolicated later.
Test Plan:
Ported a bunch of source range tests from test_jit.py. Added one more test
in test_lite_interpreter.py.
Imported from OSS
Reviewed By: raziel
Differential Revision: D27174722
fbshipit-source-id: a7b7c6088ce16dec37e823c7fefa4f0b61047e12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57098
1. Separate `readArchiveAndTensors()` from `jit/import.cpp` to a new file `jit/import_read.cpp`.
2. Use `readArchiveAndTensors()` in `mobile/import.cpp`.
3. Add a util function in C++ that can read .pkl files directly instead of loading the entire module.
ghstack-source-id: 127703081
Test Plan: CI
Reviewed By: raziel, iseeyuan
Differential Revision: D28052193
fbshipit-source-id: c8d57f3270bdcf2e52a32f7c111899bd5da7cac2
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56044
We want to be able to drop the dependency on full-jit in the auto-generated unit tests, for 2 reasons:
1. Running bloaty on the auto-generated unit tests should be somewhat representative of the actual size.
2. The runtime environment of the auto-generated unit tests should be as close to the production environment as possible to ensure that we are running the tests in a production-like runtime.
Due to the dependence on full-jit, we aren't there yet. For the auto-generated tests, we probably don't need to depend on `_export_operator_list()` eventually, but for now we do, since it is used to decide whether the model being run is a Metal GPU model or a CPU model, and it gates whether the test runs that model or not.
Eventually, we can stop doing this in the test and do it in the codegen from PTM-CLI instead (by fetching the operators from that tool, and writing out to the BUCK file which backend(s) this model is targeting). However, that will take some time to land, so in the spirit of expediency, this change is being proposed.
Discussed this offline with iseeyuan
ghstack-source-id: 126656877
Test Plan: Build + BSB.
Reviewed By: iseeyuan
Differential Revision: D27694781
fbshipit-source-id: f31a2dfd40803c02f4fd19c45a3cc6fb9bdf9697
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53578
We want to be able to log the loaded module size to the scuba table `qpl_metrics/pytorch`. Hence, adding the `model_size` field to the logged metadata when logging a module load success event.
ghstack-source-id: 123980964
Test Plan: xcheng16 How should this be tested?
Reviewed By: xcheng16, raziel
Differential Revision: D26902971
fbshipit-source-id: a7c2e9120706bd31f76f6572c8503d4acf8a89e2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52814
Currently, there is no way to load a model on a devvm (CPU) if that model has operators that the runtime doesn't support. This ends up happening (currently) for Metal GPU models, and potentially in the future for other backends that have backend-specific operators that don't have a registered implementation (even a dummy one) on CPU.
There are at least a couple reasons for why this is needed:
1. We want to extract the operator list directly from the bytecode (instead of looking it up from `mobile_info.json`).
2. We want to be able to trace the quantized operators that are invoked when loading the compressed weights for a model that has prepacked weights. xta0 root-caused this after husthyc discovered that there are untraced operators showing up when loading a Metal GPU model.
If we want to scale out to support different types of models, we absolutely need the ability to load a model on a devvm irrespective of what backend (device/etc...) it is targeted at.
ghstack-source-id: 123284366
Test Plan: The next diff in this stack is using the newly introduced methods.
Reviewed By: iseeyuan
Differential Revision: D26656266
fbshipit-source-id: eed9af2f7b55979e9c18b986b8c3b9a767153297
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52870
Add the missing parts to support to_backend modules in the lite interpreter.
1. Add ISINSTANCE instruction support, which is used in to_backend for output type check.
2. Bypass the lite interpreter's type parser by checking the qualified name: if it starts with "torch.jit", use the same type resolver as for nn modules (whose qualified names start with "__torch__"). See the sketch below.
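A minimal sketch of that check (illustrative helper only, not the actual loader code):
```cpp
#include <string>

// Route both "__torch__.*" (nn module) and "torch.jit.*" (to_backend wrapper)
// qualified names through the class type resolver instead of the lite
// interpreter's limited type parser.
bool shouldUseClassTypeResolver(const std::string& qualified_name) {
  auto starts_with = [&](const char* prefix) {
    return qualified_name.rfind(prefix, 0) == 0;
  };
  return starts_with("__torch__") || starts_with("torch.jit");
}
```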
Tests
The mobile module is serialized and loaded in ```BackendTest.TestCompiler```. The results are compared to those from the original TorchScript module.
Test Plan: Imported from OSS
Reviewed By: raziel
Differential Revision: D26715351
Pulled By: iseeyuan
fbshipit-source-id: ad9d74ee81c6aa692ab9e5dd7a9003bae5d4f01f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51432
ghstack-source-id: 120976584
torchbind is a convenient way to expose a custom class to both Python and TorchScript. CREATE_OBJECT is used to create an object of a custom class.
CREATE_OBJECT was not supported by the lite interpreter. The major reason was that for custom classes defined directly in Python, there's no language parser in the lite interpreter; that is still the case. However, for torchbind classes that are defined in C++, a Python/TorchScript parser is not needed.
This diff is to support the case of torchbind custom classes.
1. The class type can be resolved at import level.
2. If the class is not a supported torchbind class, an error message is provided at the export stage, and a workaround is also suggested.
3. Unit tests. C++: ```LiteInterpreterTest::BuiltinClass``` is added as an end-to-end test on a supported class. Python: ```test_unsupported_createobject``` is changed to ```test_unsupported_classtype``` to test unsupported classes.
Test Plan: CI
Reviewed By: raziel
Differential Revision: D26168913
fbshipit-source-id: 74e8b6a12682ad8e9c39afdfd2b605c5f8e65427
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48863
Support default arguments when invoking a module via PyTorch Lite (`mobile::Module`).
Test Plan:
buck test mode/dbg //caffe2/test/cpp/jit:jit -- LiteInterpreterTest.MethodInvocation
buck test mode/dbg caffe2/test:mobile -- test_method_calls_with_optional_arg
Reviewed By: iseeyuan
Differential Revision: D25896212
fbshipit-source-id: 6d7e7fd5f3244a88bd44889024d81ad2e678ffa5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51236
The problem we currently have with tracing is that GPU models can't load on devvm CPU machines. Here's why:
1. The Metal GPU ops don't exist so the validation that checks for missing ops kicks in and prevents loading
2. Even if the check for missing ops is commented out, the actual model contents can't be successfully loaded (see t83364623 for details)
Hence, to work around these problems and allow tracing to detect GPU models, and skip actual tracing for these (as discussed in the meeting 2 weeks ago and based on recommendations from raziel, iseeyuan, and xta0), we're adding code to detect these GPU models based on the set of operators that show up in the file `extra/mobile_info.json`.
The code then skips tracing, and picks up the root operators from the model itself.
The diff below this one will be removed before landing since we don't want to check in the model - I've kept it here in case anyone wants to patch this diff in and run the command on their devvm locally.
ghstack-source-id: 120638092
Test Plan:
See {P168657729} for a successful run of tracing on a GPU model (person segmentation tier-0, v1001) provided by xta0
Also ran `buck test //xplat/pytorch_models/build/...` successfully.
Reviewed By: ljk53
Differential Revision: D26109526
fbshipit-source-id: 6119b0b59af8aae8b1feca0b8bc29f47a57a1a67
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50932
After the change to split `_load_for_mobile()` into multiple methods, one which takes in the `extra_files` map and one which doesn't, we can change the implementation of the `deserialize()` method to use different overloads as well. Suggested by raziel on D25968216 (bb909d27d5).
ghstack-source-id: 120185089
Test Plan: Build/Sandcastle.
Reviewed By: JacobSzwejbka
Differential Revision: D26014084
fbshipit-source-id: 914142137346a6246def1acf38a3204dd4c4f52f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50795
There's [a post](https://fb.workplace.com/groups/2148543255442743/permalink/2583012411995823/) about a customer having to pass in `-Wno-global-constructors` to disable warnings related to calling constructors for global objects. This is related to the initialization of `default_extra_files_mobile` in `import.h`.
It requires end users to pass in the compiler flag, since the definition is now in code (.cpp files) that they will be compiling.
In addition, it makes the API for `_load_for_mobile` non-re-entrant (i.e. can not be safely used concurrently from multiple threads without the caller taking a mutex/lock) if the `extra_files_mobile` argument is not explicitly passed in.
Instead, a better option would be to create different overloads: one which requires all 3 parameters, and one that works with 1-2. This solves the problem without creating a static variable. A rough sketch of the pattern is below.
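A rough sketch of the overload pattern (toy names and simplified signatures, not the real `_load_for_mobile` declarations): the no-extra-files overload forwards to the full one with a local map, so no global default-argument object, no global constructor, and no shared mutable state.
```cpp
#include <istream>
#include <string>
#include <unordered_map>

using ExtraFilesMap = std::unordered_map<std::string, std::string>;

struct MobileModule {};  // stand-in for torch::jit::mobile::Module

// Overload that does the work and fills extra_files.
MobileModule loadForMobile(std::istream& in, ExtraFilesMap& extra_files) {
  // ... real deserialization would go here; the sketch only shows the shape.
  (void)in;
  (void)extra_files;
  return MobileModule{};
}

// Overload without extra_files: uses a local, per-call map instead of a
// global default argument, keeping the API re-entrant.
MobileModule loadForMobile(std::istream& in) {
  ExtraFilesMap unused;
  return loadForMobile(in, unused);
}
```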
ghstack-source-id: 120127083
Test Plan: Build Lite Interpreter and sandcastle.
Reviewed By: raziel
Differential Revision: D25968216
fbshipit-source-id: fbd80dfcafb8ef7231aca301445c4a2ca9a08995
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50521
Original commit changeset: 9731ec1e0c1d
Test Plan:
- run `arc focus2 -b pp-ios //xplat/arfx/tracking/segmentation:segmentationApple -a ModelRunner --force-with-bad-commit `
- build via Xcode, run it on an iOS device
- Click "Person Segmentation"
- Crash observed without the diff patched, and the segmentation image is able to be loaded with this diff patched
Reviewed By: husthyc
Differential Revision: D25908493
fbshipit-source-id: eef072a8a3434b932cfd0646ee78159f72be5536
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49916
Test Plan:
1. Build pytorch locally. `MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ USE_CUDA=0 DEBUG=1 MAX_JOBS=16 python setup.py develop`
2. Run `python save_lite.py`
```
import torch
# ~/Documents/pytorch/data/dog.jpg
model = torch.hub.load('pytorch/vision:v0.6.0', 'shufflenet_v2_x1_0', pretrained=True)
model.eval()
# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
import pathlib
import tempfile
import torch.utils.mobile_optimizer
input_image = Image.open('~/Documents/pytorch/data/dog.jpg')
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model
# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    input_batch = input_batch.to('cuda')
    model.to('cuda')
with torch.no_grad():
    output = model(input_batch)
# Tensor of shape 1000, with confidence scores over Imagenet's 1000 classes
print(output[0])
# The output has unnormalized scores. To get probabilities, you can run a softmax on it.
print(torch.nn.functional.softmax(output[0], dim=0))
traced = torch.jit.trace(model, input_batch)
sum(p.numel() * p.element_size() for p in traced.parameters())
tf = pathlib.Path('~/Documents/pytorch/data/data/example_debug_map_with_tensorkey.ptl')
torch.jit.save(traced, tf.name)
print(pathlib.Path(tf.name).stat().st_size)
traced._save_for_lite_interpreter(tf.name)
print(pathlib.Path(tf.name).stat().st_size)
print(tf.name)
```
3. Run `python test_lite.py`
```
import torch
from torch.jit.mobile import _load_for_lite_interpreter
# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open('~/Documents/pytorch/data/dog.jpg')
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model
reload_lite_model = _load_for_lite_interpreter('~/Documents/pytorch/experiment/example_debug_map_with_tensorkey.ptl')
with torch.no_grad():
    output_lite = reload_lite_model(input_batch)
# Tensor of shape 1000, with confidence scores over Imagenet's 1000 classes
print(output_lite[0])
# The output has unnormalized scores. To get probabilities, you can run a softmax on it.
print(torch.nn.functional.softmax(output_lite[0], dim=0))
```
4. Compare the result with pytorch in master and pytorch built locally with this change, and see the same output.
5. The model size was 16.1 MB and becomes 12.9 MB with this change.
Imported from OSS
Reviewed By: kimishpatel, iseeyuan
Differential Revision: D25731596
Pulled By: cccclai
fbshipit-source-id: 9731ec1e0c1d5dc76cfa374d2ad3d5bb10990cf0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49385
Currently, the API to export operator lists accepts a `torch::jit::Module` object and spits out an operator list. The operator list is practically used only for mobile. This is not ideal because the set of root operators may change by the time the model is subsequently optimized and exported for mobile.
What we need to do instead is glean the list of operators from the mobile model itself (`bytecode.pkl` specifically), and expose that.
Also updated the logic in `converter`.
### Before this change:
1. Get operator List from Torch Script Model
2. Convert to bytecode mobile model
### After this change:
1. Convert to bytecode mobile model
2. Use this converted mobile model to get the list of operators for each method on the model
ghstack-source-id: 118796752
Test Plan:
Added a unit test in `test_lite_interpreter.cpp` to ensure that all operators referenced by the model show up in the exported operator list. Also made `test_lite_interpreter.cpp` runnable from `xplat/caffe2/BUCK`, since this is where the production code will be built from.
Verified that the list of operators produced before and after this change for an example model (segmentation) are the same.
{P147863234}
Also verified that the operator lists for BI-Xray model is different (we have been having problems with missing operators for this one): {P154903132}
Reviewed By: iseeyuan
Differential Revision: D24690094
fbshipit-source-id: 0426a6ef90456a811010cfe337c415882ae2deff
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48863
Support default arguments when invoking a module via PyTorch Lite (`mobile::Module`).
Test Plan:
buck test mode/dbg //caffe2/test/cpp/jit:jit -- LiteInterpreterTest.MethodInvocation
buck test mode/dbg caffe2/test:mobile -- test_method_calls_with_optional_arg
Reviewed By: raziel, iseeyuan
Differential Revision: D25152559
fbshipit-source-id: bbf52f1fbdbfbc6f8fa8b65ab524b1cd4648f9c0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47425
Extra files can be exported in a lite interpreter model, but they could not be loaded. This PR adds the capability to load extra files from a lite interpreter model. Because extra_files is a default argument, it should not affect the existing usage of _load_for_mobile. It's a simple assembly of a generic unordered_map; no additional dependency should be introduced, and the size overhead should be small (to be tested). A minimal usage sketch follows.
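A minimal C++ usage sketch, assuming the overload of `torch::jit::_load_for_mobile` that takes the extra-files map by reference (header path, model filename, and extra-file key are illustrative):
```cpp
#include <torch/csrc/jit/mobile/import.h>

#include <iostream>
#include <string>
#include <unordered_map>

int main() {
  // Keys name the extra files to pull out of the archive; the values are
  // filled in during loading. This is the ExtraFilesMap type (an
  // unordered_map<string, string>).
  std::unordered_map<std::string, std::string> extra_files{{"metadata.json", ""}};
  auto module = torch::jit::_load_for_mobile(
      "model.ptl", /*device=*/c10::nullopt, extra_files);
  std::cout << extra_files["metadata.json"] << "\n";
  (void)module;
}
```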
Test Plan: Imported from OSS
Reviewed By: kwanmacher
Differential Revision: D24770266
Pulled By: iseeyuan
fbshipit-source-id: 7e8bd301ce734dbbf36ae56c9decb045aeb801ce