This PR applies clang-tidy readability checks to jit sources and all headers in the code base.
`readability-redundant-inline-specifier` is suppressed because it would incur too many changes: the check flags redundant `inline` specifiers on function and variable declarations, and the code base has many in-class method definitions that are marked `inline`.
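For illustration, a minimal example (not taken from the PR) of what the suppressed check flags:
```
// An in-class member function definition is implicitly inline, so the explicit
// specifier below is what readability-redundant-inline-specifier reports.
struct Foo {
  inline int bar() const { return 42; }  // 'inline' is redundant here
  int baz() const { return 43; }         // equivalent, implicitly inline
};
```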
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164652
Approved by: https://github.com/Skylion007
Summary:
The TorchScript interpreter had multiple opcodes whose logic had the potential to access the registers_ array out of bounds.
This change ensures that all registers_ accesses are in bounds or an exception will be thrown.
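A minimal sketch of the kind of guard this describes; the helper name and register layout are assumptions, not the interpreter's actual code:
```
#include <vector>
#include <ATen/core/ivalue.h>
#include <c10/util/Exception.h>

// Hypothetical bounds-checked register access: reject indices outside
// [1, registers.size()] before touching the underlying vector.
c10::IValue& checkedReg(std::vector<c10::IValue>& registers, size_t n) {
  TORCH_CHECK(n >= 1 && n <= registers.size(), "register index ", n, " is out of bounds");
  return registers[registers.size() - n];
}
```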
Test Plan: contbuild + OSS signals
Differential Revision: D49748737
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110300
Approved by: https://github.com/malfet, https://github.com/kimishpatel
Summary:
The INTERFACE_CALL opcode for the mobile TorchScript interpreter contained an out of bounds read issue leading to memory corruption.
This change adds an explicit check that the number of inputs passed to the format method called when handling the INTERFACE_CALL opcode is valid and within the bounds of the stack.
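A minimal sketch of such a check, with assumed names (`inst_N` stands for the operand count carried by the instruction):
```
#include <ATen/core/stack.h>
#include <c10/util/Exception.h>

// Hypothetical validation: the INTERFACE_CALL operand count must name at least
// the receiver and must not exceed what is actually on the stack.
void checkInterfaceCallInputs(const torch::jit::Stack& stack, size_t inst_N) {
  TORCH_CHECK(
      inst_N >= 1 && inst_N <= stack.size(),
      "INTERFACE_CALL expects ", inst_N, " inputs but the stack only holds ", stack.size());
}
```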
Test Plan: contbuild + OSS signals
Differential Revision: D49739450
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110301
Approved by: https://github.com/dbort
Summary: This diff fixes an OOB read found by fuzzing in torch/../jit/mobile
Test Plan:
CI and
```
arc lionhead crash reproduce 853835926354224
```
doesn't crash anymore.
Differential Revision: D49537377
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110453
Approved by: https://github.com/davidberard98
Summary: This diff fixes a heap UAF found by fuzzing in torch/csrc/jit/mobile/interpreter.cpp
Test Plan:
CI and
```
arc lionhead crash reproduce 1009060456885023
```
doesn't crash anymore.
Reviewed By: malfet
Differential Revision: D49538326
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110289
Approved by: https://github.com/malfet
A number of OSS PRs were reverted because of new signed-unsigned comparison warnings, which are treated as errors in some internal builds.
Not sure how those selective rules are applied, but this PR removes `-Wno-sign-compare` from the PyTorch codebase.
The only tricky part in this PR is making sure that non-ASCII character detection works for both signed and unsigned chars here:
6e3d51b08a/torch/csrc/jit/serialization/python_print.cpp (L926)
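A minimal sketch of a check that behaves the same regardless of whether `char` is signed (not the exact code in python_print.cpp):
```
// Casting to unsigned char first avoids the signed-unsigned comparison warning
// and the surprises of comparing a possibly negative char value.
bool isNonAsciiChar(char c) {
  return static_cast<unsigned char>(c) > 0x7F;
}
```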
Exclude several files from sign-compare if flash attention is used, due to the violation in cutlass, to be fixed by https://github.com/NVIDIA/cutlass/pull/869
Do not try to fix sign compare violations in caffe2 codebase
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96723
Approved by: https://github.com/albanD
Not only is this change usually shorter and more readable, it can also yield better performance: size() is not always a constant-time operation (such as on linked lists), but empty() always is.
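A minimal illustration of the transformation (the underlying clang-tidy check is readability-container-size-empty):
```
#include <list>

bool hasWork(const std::list<int>& queue) {
  // return queue.size() == 0;  // flagged: size() is not guaranteed to be constant time
  return !queue.empty();        // preferred: always constant time and states the intent
}
```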
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
Apply clang-tidy check modernize-use-emplace. This is slightly more efficient because it uses an in-place constructor, and it is the recommended style in parts of the codebase covered by clang-tidy. This just manually applies the check to the rest of the codebase. Pinging @ezyang as this is related to my other PRs he reviewed, like #89000.
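A minimal illustration of the rewrite modernize-use-emplace performs:
```
#include <string>
#include <utility>
#include <vector>

void fillOps(std::vector<std::pair<std::string, int>>& ops) {
  ops.push_back(std::make_pair("aten::add", 2));  // builds a temporary pair, then moves it
  ops.emplace_back("aten::add", 2);               // constructs the pair in place
}
```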
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91077
Approved by: https://github.com/ezyang
Summary: Using the tool from D39559248, I was able to make g2p faster on mobile by taking a look at profiles on stella frames. It turned out that the PyTorch interpreter code does some logging that ends up being a pretty big bottleneck.
Differential Revision: D39901455
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85907
Approved by: https://github.com/dzdang
Summary: Reland for D33282878 (911d527b87). Land the backend change first to maintain FC. Will wait for 2 weeks after this diff is in, and then land the front-end change in the next diff.
Test Plan:
test in next diff
time buck test mode/dev-nosan fblearner/flow/projects/langtech/translation:tests -- test_e2e_base_training
Reviewed By: gmagogsfm
Differential Revision: D33342547
fbshipit-source-id: b3dee9a4bdfd78103848c12629e5fccafdd621e3
(cherry picked from commit ae1935f1af)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70212
Use DynamicType instead of ListType all over the place in Lite Interpreter. Namely we need to modify the following places:
1. Type parser which produces the Type constants.
2. IValue::type() which returns reflected Type from IValues.
3. Helper functions to construct the container value.
4. Typechecks which test whether a type instance is a particular container type.
ghstack-source-id: 146818619
Test Plan: CI
Reviewed By: iseeyuan
Differential Revision: D33176931
fbshipit-source-id: 9144787f5fc4778538e5c665946974eb6171a2e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70205
Use DynamicType instead of TupleType all over the place in Lite Interpreter. Namely we need to modify the following places:
1. Type parser which produces the Type constants.
2. IValue::type() which returns reflected Type from IValues.
3. Helper functions to construct the container value.
4. Typechecks which test whether a type instance is a particular container type.
ghstack-source-id: 146818620
Test Plan: CI
Reviewed By: iseeyuan
Differential Revision: D33176925
fbshipit-source-id: 00f7a5db37ba772c912643c733db6c52dfdc695d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70202
Use DynamicType instead of DictType all over the place in Lite Interpreter. Namely we need to modify the following places:
1. Type parser which produces the Type constants.
2. IValue::type() which returns reflected Type from IValues.
3. Helper functions to construct the container value.
4. Typechecks which test whether a type instance is a particular container type.
ghstack-source-id: 146735648
Test Plan: no behavior change.
Reviewed By: iseeyuan
Differential Revision: D33137257
fbshipit-source-id: 971bf431658c422ea9353cc32cdab66e98876e9d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70120
Before the change:
```
c10::Type t = ivalue.type();
```
After the change:
```
c10::Type t = ivalue.type();
c10::DynamicType d = ivalue.type<c10::DynamicType>(); // new path
```
The new path will be adopted in PyTorch Lite Interpreter to support lightweight type reflection. Note that type getters are selected at compile time, so there is no performance overhead. The benefits of having a DynamicType will be elaborated in a separate document, but in short, DynamicType provides an isolated type system for controlling binary size bloat, and shrinks ~20 supported Type symbols down into one so that the size taken by specializations and function name symbols is greatly reduced.
Lite Interpreter should only use the `<DynamicType>` variant of the interfaces from ATen, to reduce binary size.
ghstack-source-id: 146727334
(Note: this ignores all push blocking failures!)
Test Plan: CI
Reviewed By: gmagogsfm
Differential Revision: D33102276
fbshipit-source-id: c5354e7d88f9de260c9b02636214b40fe15f8a10
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70119
JIT type and IValue have a mutual dependency because of various reasons today. It makes things worse when we have `jit_type.h` and `ivalue.h` mutually include each other, causing non-deterministic name resolution in different translation units and preventing us from safely using symbols from `jit_type.h` in `ivalue.h`. This diff doesn't address the mutual dependency between JIT type and IValue at the linking level, but at the header level.
We choose to remove the include of `ivalue.h` from `jit_type.h` because it's much harder to make a type-free header for IValue. The way we achieve this is by removing EnumType (which is the only type depending on IValue in JIT types) from `jit_type.h`, and letting downstream users include an explicit `enum_type.h` as needed. We also move some IValue inline member function definitions back to `ivalue_inl.h` so that `jit_type.h` doesn't need the IValue definition to be present.
We also remove a seemingly accidental include of `jit_type.h` from `ATen/core/List_inl.h` so that `ivalue.h` can include `jit_type.h` directly; otherwise, due to another mutual inclusion between `ivalue.h` and `List_inl.h`, we could still get non-deterministic behavior.
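A sketch of what this means for downstream code that still needs EnumType (the include paths reflect the headers named above; the pairing is illustrative):
```
// jit_type.h no longer drags in IValue, and EnumType lives in its own header,
// so callers that need it include it explicitly.
#include <ATen/core/enum_type.h>
#include <ATen/core/jit_type.h>
```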
ghstack-source-id: 146727333
(Note: this ignores all push blocking failures!)
Test Plan: no behavior change.
Reviewed By: gmagogsfm
Differential Revision: D33155792
fbshipit-source-id: d39d24688004c2ec16c50dbfdeedb7b55f71cd36
Summary:
This change has two parts:
1. Convert Function -> mobile::Function.
2. Serialize mobile::Function.
This also opens the opportunity to create a mobile::Module without saving/reloading.
Fixes #{issue number}
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66494
Reviewed By: zhxchen17
Differential Revision: D32293022
Pulled By: qihqi
fbshipit-source-id: 29b43d47ff86071d5e2f9d6ca4dba4445711ce3d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65970
ghstack-source-id: 141842338
mobile::Function should inherit from jit::Function because, for interface call support, we need an abstract jit::Function type stored in the corresponding ClassTypes so that we can look up methods in there. Previously mobile::Function was implemented separately, which prevented this. Since we got rid of all the unneeded virtual methods from jit::Function, we can inherit from torch::jit::Function without too much cost.
NOTE that torch::jit::Function is already a dependency because we need it to support custom class calls. We should be able to use Function uniformly without looking into whether it's a builtin function or a mobile::Function.
Test Plan: no behavior change.
Reviewed By: iseeyuan, mrshenli
Differential Revision: D31326148
fbshipit-source-id: 36caeaf3c8c5f54c23a1a7c8c9e2fd6e78b19622
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65345
FooType::get() can return a const reference. Inconveniently, converting shared_ptr<FooType> to shared_ptr<Type> requires a copy & refcount bump, so to properly take advantage of this in unshapedType() we need to take a const Type& in isSubtypeOf(), which is good practice anyway -- don't require a shared_ptr if you don't need to take ownership.
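A hedged usage sketch of the resulting API shape (simplified; the exact overload set may differ):
```
#include <ATen/core/jit_type.h>

// isSubtypeOf accepts const Type&, so the singleton returned by
// TensorType::get() can be passed by dereference, with no shared_ptr copy
// or refcount bump at the call site.
bool isTensorLike(const c10::Type& t) {
  return t.isSubtypeOf(*c10::TensorType::get());
}
```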
ghstack-source-id: 140044165
Test Plan:
CI
perf says c10::unshapedType time decreased from 2.8% to 2.2% during static runtime startup, though I expect this to be generally beneficial.
Reviewed By: hlu1
Differential Revision: D31027361
fbshipit-source-id: 676feb81db9f74ad7b8651d8774f4ecb4cfa6ab8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66131
Turns out that a model with 72k instructions causes about 0.5MiB of additional memory overhead (if there's an 8 byte memory overhead per instruction). This is not necessary if we're building w/o eager symbolication support. This change eliminates the 8 byte `debug_handle` if the build is w/o eager symbolication support.
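A sketch of the idea only, with an assumed struct layout; the guard shown is the SYMBOLICATE_MOBILE_DEBUG_HANDLE macro mentioned elsewhere in this log, and its use in this exact shape is an assumption:
```
#include <cstdint>

// Hypothetical per-instruction record: the debug handle only exists when the
// build enables eager symbolication, saving 8 bytes per instruction otherwise.
struct InstructionRecord {
  int32_t X;
  int32_t N;
#if defined(SYMBOLICATE_MOBILE_DEBUG_HANDLE)
  int64_t debug_handle;
#endif
};
```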
ghstack-source-id: 140045478
(Note: this ignores all push blocking failures!)
Test Plan:
```
buck build -c "pt.enable_eager_symbolication"=1 //xplat/caffe2/fb/lite_predictor:lite_predictor
buck build //xplat/caffe2/fb/lite_predictor:lite_predictor
```
Reviewed By: kimishpatel
Differential Revision: D31387784
fbshipit-source-id: af56787ad833b990a46b79ab021e512edaa22143
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64307
Original commit changeset: 0b2aa7c57d08
Restores original changes.
This diff changes the way operator profiling is done in the lite predictor
benchmarking binary.
Instead of using custom callbacks, it uses KinetoEdgeCPUProfiler to profile
events and then generates operator-level metrics from them.
Since KinetoEvents do not contain CPU clock time, now we report only wallclock
time.
This unifies the various profiling efforts we have for benchmarking purposes. In
production we will still use the observer-based mechanism, but the advantage of
using the kineto profiler is that we get a few other things for free, such as:
- chrome trace generation.
- operator-level memory profiling (to be added)
- flop counts (to be added)
Furthermore, we can possibly use a Python post-processing script to parse the chrome
trace and generate output similar to torch.profiler. (To be done)
This also removes some tests from test_lite_interpreter.cpp which were testing module hierarchy in debug info. They should be covered by test_mobile_profiler.cpp.
Test Plan:
aibench run
Model without debug info:
https://www.internalfb.com/intern/aibench/details/219598441154763
Model with debug info and --print_module_info true (see Operator summary has now module hierarchy information).
https://www.internalfb.com/intern/aibench/details/617154236292985
Reviewed By: raziel
Differential Revision: D30680354
fbshipit-source-id: b6ba0d59c510c13d13d9935b1d8051cc82ffa4e9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63367
This diff changes the way operator profiling is done in the lite predictor
benchmarking binary.
Instead of using custom callbacks, it uses KinetoEdgeCPUProfiler to profile
events and then generates operator-level metrics from them.
Since KinetoEvents do not contain CPU clock time, now we report only wallclock
time.
This unifies the various profiling efforts we have for benchmarking purposes. In
production we will still use the observer-based mechanism, but the advantage of
using the kineto profiler is that we get a few other things for free, such as:
- chrome trace generation.
- operator-level memory profiling (to be added)
- flop counts (to be added)
Furthermore, we can possibly use a Python post-processing script to parse the chrome
trace and generate output similar to torch.profiler. (To be done)
Test Plan:
aibench run
Model without debug info:
https://www.internalfb.com/intern/aibench/details/219598441154763
Model with debug info and `--print_module_info true` (see Operator summary has now module hierarchy information).
https://www.internalfb.com/intern/aibench/details/617154236292985
Reviewed By: raziel
Differential Revision: D30327514
fbshipit-source-id: 3bb2f2daaaedfb04bd6f5d9c91292783f9c4344f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63499
https://github.com/pytorch/pytorch/pull/62418 combined the instruction and debug handle. This change fixes the debugging message.
ghstack-source-id: 136184053
Test Plan: Uncomment and it works
Reviewed By: kimishpatel, raziel
Differential Revision: D30390699
fbshipit-source-id: e32b7b297ad3b7d8bffebd025d15519083a244c4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62419
This diff adds support for the CPU-only Kineto profiler on mobile, thus
enabling chrome trace generation on mobile. This brings the C++ API for
mobile profiling on par with TorchScript.
This is done via:
1. Utilizing debug handle annotations in KinetoEvent.
2. Adding post-processing capability, via callbacks, to
KinetoThreadLocalState.
3. Creating a new RAII-style profiler, KinetoEdgeCPUProfiler, which can be
used in the surrounding scope of model execution (see the usage sketch below). This writes the chrome
trace to the location specified in the profiler constructor.
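A hedged usage sketch of the RAII profiler described in (3); constructor arguments are simplified and the output path is illustrative:
```
#include <torch/csrc/jit/mobile/module.h>
#include <torch/csrc/jit/mobile/profiler_edge.h>

void profiledRun(torch::jit::mobile::Module& m, std::vector<c10::IValue> inputs) {
  // The profiler wraps the execution scope; the chrome trace is written to the
  // given path when the guard is destroyed at the end of the block.
  torch::jit::mobile::KinetoEdgeCPUProfiler profiler(m, "/tmp/mobile_trace.json");
  m.forward(std::move(inputs));
}
```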
Test Plan:
MobileProfiler.ModuleHierarchy
Imported from OSS
Reviewed By: raziel
Differential Revision: D29993660
fbshipit-source-id: 0b44f52f9e9c5f5aff81ebbd9273c254c3c03299
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62418
Debug handles have a one-to-one correspondence with instructions, so just
combine them into one.
Test Plan:
CI
Imported from OSS
Reviewed By: raziel
Differential Revision: D29993661
fbshipit-source-id: 125c7163174cf66624dd95f110fdc8208fea8a07
Summary:
The `cppcoreguidelines-avoid-non-const-global-variables` check is disabled, as the GoogleTest `TEST` macro is non-compliant with it, as is `DEFINE_DISPATCH`.
All changes but the ones to `.clang-tidy` are generated using the following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55462
Handle and symbolicate exception callstacks thrown from the backend.
The objective of this diff is to improve error reporting when
exceptions are raised from a lowered backend. We would effectively like to
get the same model-level stack trace that you would get without having
lowered some module to a backend.
For example:
```
class AA(nn.Module):
def forward(self, x, y):
return x + y
class A(nn.Module):
def __init__(...):
self.AA0 = AA()
def forward(self, x, y):
return self.AA0.forward(x, y) + 3
class B(nn.Module):
def forward(self, x):
return x + 2
class C(nn.Module):
def __init__(...):
self.A0 = A()
self.B0 = B()
def forward(self, x, y):
return self.A0.forward(x, y) + self.B0.forward(x)
```
If we then do C().forward(torch.rand((2,3)), torch.rand(14,2)) we
will likely see an error stack like:
```
C++ exception with description "The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
File "<string>", line 3, in forward
def forward(self, x, y):
return self.A0.forward(x, y) + self.B0.forward(x)
~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 3, in forward
def forward(self, x, y):
return self.AA0.forward(x, y) + 3
~~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 3, in forward
def forward(self, x, y):
return x + y
~~~~~ <--- HERE
```
We would like to see the same error stack if we lowered C.A0 to some
backend.
With this diff we get something like:
```
Module hierarchy:top(C).A0(backend_with_compiler_demoLoweredModule).AA0(AA)
Traceback of TorchScript (most recent call last):
File "<string>", line 3, in FunctionName_UNKNOWN
def forward(self, x, y):
return self.A0.forward(x, y) + self.B0.forward(x)
~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 5, in FunctionName_UNKNOWN
typed_inputs: List[Any] = [x, y, ]
if self.__backend.is_available() :
_0, = self.__backend.execute(self.__handles["forward"], typed_inputs)
~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
assert isinstance(_0, Tensor)
return _0
File "<string>", line 3, in FunctionName_UNKNOWN
def forward(self, x, y):
return self.AA0.forward(x, y) + 3
~~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 3, in FunctionName_UNKNOWN
def forward(self, x, y):
return x + y
~~~~~ <--- HERE
```
This is achieved in 3 parts:
Part 1:
A. BackendDebugInfoRecorder:
This is instantiated during backend lowering, in `to_backend`, before calling the preprocess
function corresponding to the backend. It facilitates recording of
debug info (such as source range + inlined callstack) for the lowered module.
B. Instantiate WithBackendDebugInfoRecorder with BackendDebugInfoRecorder.
This initializes a thread-local pointer to the BackendDebugInfoRecorder.
C. generate_debug_handles:
In the preprocess function, the backend calls generate_debug_handles
for each method being lowered separately. generate_debug_handles
takes the `Graph` of the method being lowered and returns a map
of Node* to debug handles. The backend is responsible for storing the debug
handles appropriately so that it can raise exceptions (and later do profiling)
using debug handles when the exception being raised corresponds to a
particular Node that was lowered.
Inside generate_debug_handles, we query the current
BackendDebugHandleInfoRecorder, which issues debug handles. This debug
handle manager issues debug handles as well as records the
debug_handles-to-<source range, inlined callstack> map.
D. Back in `to_backend`, once the preprocess function has finished
lowering the module, we call `stopRecord` on the
BackendDebugInfoRecorder. This returns the debug info map, which
is then stored inside the lowered module.
Part 2:
Serialization:
During serialization for bytecode (lite interpreter), we will do two
things:
1. Extract all the source ranges that are contained inside the
debug_handles-to-<source range, inlined callstack> map for the lowered
module. These are the source ranges corresponding to the debug handles,
including what is in the inlined callstack. Since we replaced the original
module with the lowered module, we won't be serializing code for the original
module, and thus it has no source ranges of its own. That is why the source
ranges have to be stored separately. We lump all the source ranges for all the
lowered modules into one single debug_pkl file.
2. Then we serialize the debug_handles-to-<source range, inlined
callstack> map.
During deserialization we can then reconstruct the
debug_handles-to-<source range, inlined callstack> map. Given that all
debug_handles are unique, we do not need any module information.
Test Plan:
Tests are added in test_backend.cpp
Imported from OSS
Differential Revision: D27621330
Reviewed By: raziel
Pulled By: kimishpatel
fbshipit-source-id: 0650ec68cda0df0a945864658cab226a97ba1890
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54284
In order to bring mobile deployment, via lite interpreter, to feature
parity with JIT with respect to model-level debug information, we must make
that debug information available to the mobile runtime.
At the moment, model-level debug information is stored in SourceRange,
which associates nodes of the graph with where they come from in the original
Python source code.
This information is serialized as part of debug_pkl and deserialized
when JIT loads the model and reads the model code.
On lite interpreter, we do not have access to all the functionality of
JIT, and hence we cannot load the model in the same way as JIT, by reading
code, constructing the module hierarchy and the graphs corresponding to module
methods, etc. Instead, in lite interpreter, only the bytecode corresponding to
the compiled graph, Code, is saved.
Thus in order to annotate OPs in the bytecode with equivalent
SourceRange information we do the following:
1. During model serialization, we create a unique tag for each source
range of the model.
2. Create a map of <SourceRange, tag>
3. During debug_pkl serialization we save tag along with SourceRange, on
top of byte offset.
4. During bytecode generation, the methods of the top module are
lowered. During this process methods are inlined. In the inlined graph,
when a node of the graph is lowered to bytecode, we query the node's source
range and look it up against the map.
5. The resulting source range tag is serialized in module_debug_info.
6. During model deserialization, we read all the debug_pkl records in
the archive and create a map of <tag, SourceRange>
7. This map can be used to find source code information.
During mobile runtime:
1. We read all the debug_pkl records and create <tag=debug_handle,
SourceRange> map.
1.1 This map, MobileDebugInfo, is a member of mobile Module.
2. The interpreter catches appropriate exceptions, sets the thread-local
debug handle, and rethrows the exception.
3. In Function's run method we catch the exception and query the current debug
handle where the exception happened.
4. Query MobileDebugInfo with debug handle to retrieve source range and
augment error with source range info.
This information is still incomplete as it does not contain entire
callstack.
In the following diffs we will serialize InlinedCallStack directly.
Note that compilation is gated by SYMBOLICATE_MOBILE_DEBUG_HANDLE macro,
so that mobile builds can avoid building MobileDebugInfo, source range
and source range pickler/unpickler. Later we will add a path where, if
building without debug support, the stack trace will contain only debug
handles. They can be symbolicated later.
Test Plan:
Ported a bunch of source range tests from test_jit.py. Added one more test
in test_lite_interpreter.py.
Imported from OSS
Reviewed By: raziel
Differential Revision: D27174722
fbshipit-source-id: a7b7c6088ce16dec37e823c7fefa4f0b61047e12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56807
If I understand correctly, there's no reason to create your own instance of these global singleton types.
ghstack-source-id: 127312270
Test Plan: CI
Reviewed By: SplitInfinity
Differential Revision: D27973447
fbshipit-source-id: f12df69d185f1baaa45f2ac6eac70570a7a65912
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54110
dictConstruct doesn't need to make its caller have a `shared_ptr<DictType>`. It also doesn't need to do extra `shared_ptr` copies into the `key_type` and `value_type` locals.
ghstack-source-id: 124150642
Test Plan: fitsships
Reviewed By: ezyang
Differential Revision: D27101782
fbshipit-source-id: 3c632ad9d8f1bd7bdf37f517a86aca27bd41548a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54076
If we don't constrain ourselves to use `torch::jit::pop`, we can avoid copying a string or moving IValues around.
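A hedged sketch of the pattern (the peek/drop helpers are from ATen/core/stack.h; the concrete op in the PR differs):
```
#include <string>
#include <ATen/core/stack.h>

// Read the operand in place with peek() and shrink the stack once with drop(),
// instead of pop()-ing IValues (and the string inside) into locals.
std::string formatHead(torch::jit::Stack& stack, size_t num_inputs) {
  const auto& fmt = torch::jit::peek(stack, 0, num_inputs).toStringRef();
  std::string out = fmt;  // use the string without an intermediate IValue move
  torch::jit::drop(stack, num_inputs);
  return out;
}
```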
ghstack-source-id: 124040891
Test Plan:
existing tests
spot-checked regular interpreter assembly; seems better
Reviewed By: dhruvbird, walterddr
Differential Revision: D27087204
fbshipit-source-id: 7cf355dbcec31409bdb37afa09d7df85cf2a7e4b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53201
This resulted in [S223540](https://www.internalfb.com/intern/sevmanager/view/s/223540), which caused trouble on Android.
1. The Python code has a call to `warnings.warn()`, which resulted in generated code emitting the `WARN` instruction on lite-interpreter.
2. The code for handling that instruction/op-code popped off the value in a call to the `TORCH_WARN()` *macro*.
3. This macro conditionally compiles out evaluation of its arguments if `STRIP_ERROR_MESSAGES` is defined, which resulted in the stack not getting popped and the lite-interpreter returning the last value pushed onto the stack (see the sketch below).
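A hedged sketch of the failure mode and the shape of the fix (simplified; not the interpreter's actual switch case):
```
#include <ATen/core/stack.h>
#include <c10/util/Exception.h>

void handleWarnInstruction(torch::jit::Stack& stack) {
  // Buggy shape: TORCH_WARN(pop(stack).toStringRef());
  // With STRIP_ERROR_MESSAGES defined, the macro drops its arguments and the
  // pop never happens, leaving a stale value on the stack.
  auto msg = torch::jit::pop(stack);  // always executed, regardless of build flags
  TORCH_WARN(msg.toStringRef());      // only the message may be stripped out
}
```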
I've attempted to re-produce it using this python code: {P243842428}
ghstack-source-id: 122990001
(Note: this ignores all push blocking failures!)
Test Plan:
Created a new unit test to re-produce the failure in the test. Was able to do so locally using the following command:
```
buck test -c pt.strip_error_messages=1 //xplat/caffe2:test_s223540
```
However, since `pt.strip_error_messages=0` for dev and continuous builds, I have had to check in a separate contbuild config to try and trigger this failure on contbuild.
Reviewed By: iseeyuan
Differential Revision: D26765662
fbshipit-source-id: 63c3c96d84ce6a9e5471f13d80165aa3718be9a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52870
Add the missing parts to support to_backend modules by lite interpreter.
1. Add ISINSTANCE instruction support, which is used in to_backend for output type check.
2. Bypass lite interpreter's type parser by checking the qualified name: if it starts with "torch.jit", use the same type resolver as for nn modules (whose names start with "__torch__"), as sketched below.
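A hedged sketch of the check in (2); resolveObjectType is a hypothetical stand-in for the resolver used for nn.Module class types, and only c10::parseType is an actual declared entry point:
```
#include <string>
#include <torch/csrc/jit/mobile/type_parser.h>

// Hypothetical stand-in for the resolver used for "__torch__.*" class types.
c10::TypePtr resolveObjectType(const std::string& qualified_name);

c10::TypePtr resolveMobileType(const std::string& qn) {
  auto starts_with = [&](const char* prefix) { return qn.rfind(prefix, 0) == 0; };
  if (starts_with("__torch__") || starts_with("torch.jit")) {
    return resolveObjectType(qn);  // same path as nn.Module classes
  }
  return c10::parseType(qn);       // lite interpreter's built-in type parser
}
```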
Tests:
The mobile module is serialized and loaded in ```BackendTest.TestCompiler```. The results are compared to those from the original TorchScript module.
Test Plan: Imported from OSS
Reviewed By: raziel
Differential Revision: D26715351
Pulled By: iseeyuan
fbshipit-source-id: ad9d74ee81c6aa692ab9e5dd7a9003bae5d4f01f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51432
ghstack-source-id: 120976584
Torchbind is a convenient way to include a custom class in both Python and TorchScript. CREATE_OBJECT is used to create an object of a custom class.
CREATE_OBJECT was not supported by lite interpreter. The major reason was that for custom classes directly defined in Python, there's no language parser in lite interpreter. That is still the case. However, for torchbind classes that are defined in C++, a Python/TorchScript parser is not needed.
This diff supports the case of torchbind custom classes.
1. The class type can be resolved at import level.
2. If the class is not a supported torchbind class, an error message is provided at the export stage. A workaround is also suggested.
3. Unit tests. C++: ```LiteInterpreterTest::BuiltinClass``` is added as an end-to-end test on supported class. Python: ```test_unsupported_createobject``` is changed to ```test_unsupported_classtype``` to test unsupported classes.
Test Plan: CI
Reviewed By: raziel
Differential Revision: D26168913
fbshipit-source-id: 74e8b6a12682ad8e9c39afdfd2b605c5f8e65427