Commit Graph

352 Commits

Author SHA1 Message Date
Scott Wolchok
b87d3fa432 [PyTorch][jit] Don't allow create() on singleton types (#56807)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56807

If I understand correctly, there's no reason to create your own instance of these global singleton types.
ghstack-source-id: 127312270

Test Plan: CI

Reviewed By: SplitInfinity

Differential Revision: D27973447

fbshipit-source-id: f12df69d185f1baaa45f2ac6eac70570a7a65912
2021-04-30 10:28:50 -07:00
Luca Wehrstedt
311ad5e3af Merge CUDAFuture into ivalue::Future (#57052)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57052

This PR caps a stack whose goal was to merge CUDAFuture into ivalue::Future. CUDAFuture used to be a subclass of ivalue::Future, which was already pretty good, but it meant that in several places we needed `#ifdef`s or registries in order to create the right type of class, which was annoying. We've made CUDAFuture device-agnostic, by using generic helpers, so that it doesn't depend on CUDA. Now all its code can be inserted into ivalue::Future.

This PR does this very naively, by copy-pasting CUDAFuture's code into the (previously empty) virtual methods of ivalue::Future. This helps ensure the correctness of this PR, as it's straightforward to see it behaves exactly like before. However, we probably want to polish it a bit later to iron out some wrinkles.
ghstack-source-id: 127713138

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D28036829

fbshipit-source-id: 3e5b16402f5dc245c1fcb9d7bf06db64dcb0d2a3
2021-04-29 09:31:52 -07:00
Luca Wehrstedt
71c2f88b90 Make CUDAFuture handle any kind of device type (#57051)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57051

Make CUDAFuture autodetect the device type from its arguments (which thus change from DeviceIndices to full Devices). This in fact transforms CUDAFuture into an AnythingFuture, since it's not tied to CUDA in any way anymore. Having made it fully device-agnostic, we'll merge it into ivalue::Future in the next PR.
ghstack-source-id: 127713134

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D28032711

fbshipit-source-id: 8ba23b1b0d97f61db8693cd5f3c7bae7989a9bcd
2021-04-29 09:31:50 -07:00
Nikita Shulga
4cb534f92e Make PyTorch code-base clang-tidy compliant (#56892)
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    # Generate NOLINT stubs for a single file, then commit the result so each
    # file's changes land in their own commit
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892

Reviewed By: H-Huang

Differential Revision: D27991944

Pulled By: malfet

fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
2021-04-28 14:10:25 -07:00
Jacob Szwejbka
60a5ebfac2 [Pytorch Edge] Remove methods_to_optimize arg (#57045)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57045

Went back and adjusted the previous optimizations so they are applied to every function, and cleaned up the API to match.

ghstack-source-id: 127214412
ghstack-source-id: 127536155

Test Plan: unit test

Reviewed By: kimishpatel

Differential Revision: D27950859

fbshipit-source-id: 214e83d5a19b452747fe223615815c10fa4aee58
2021-04-27 14:54:13 -07:00
Pritam Damania
dc8a8cea79 Move caffe2 signal_handler to c10. (#56717)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56717

The signal_handler was under the caffe2 namespace but was being used
by PyTorch as well.

I've fixed this by moving it to the c10 namespace, where now both C2 and PyTorch
can use it.

The signal_handler interface in caffe2/utils/signal_handler.h is kept the same
for backward compatibility for C2, but most of the common code is moved to c10.
ghstack-source-id: 127446929

Test Plan: waitforbuildbot

Reviewed By: ezyang

Differential Revision: D27946738

fbshipit-source-id: d6228d1a0108f4c807d405e7a0bb799c5375388f
2021-04-26 23:08:12 -07:00
Luca Wehrstedt
a688b29750 Support custom Python classes in CUDAFuture (#56516)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56516

One problem with CUDAFuture's extraction of DataPtrs from IValues is that it only supported Python objects that could be converted to "regular" IValues (e.g., lists/dicts/tuples of ints/strings/tensors/...). One notable exception is custom Python classes, which are in fact a very common data type transferred over RPC. The only solution we found for those is to use the Python pickler to extract the tensors contained in them.

We can't insert a Python dependency directly into CUDAFuture, so instead I'm proposing to use the same indirection technique used to support `getSubValues` on Python objects: define some methods on the abstract class `PyObjectHolder` (which can be used by CUDAFuture) but only implement them in the concrete subclass `ConcretePyObjectHolder` (which is only built when Python support is enabled).

I am a bit worried about the performance toll of this (pickling isn't exactly known to be cheap) but I think we should start by providing a functionally complete API. We already have ideas on how to make this faster if needed, for example by having users provide a custom DataPtr extractor tailored to their class via a decorator. (Or just use TorchScript).
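
A minimal sketch of the use case this unblocks, assuming an already-initialized RPC setup with a CUDA-aware TensorPipe backend and a device map to a worker named "worker1"; the `Batch` class and function names are hypothetical:
```
import torch
import torch.distributed.rpc as rpc

class Batch:  # plain Python class; the tensors inside are found via the pickler
    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

def make_batch():
    return Batch(torch.randn(4, 8, device="cuda:0"),
                 torch.zeros(4, device="cuda:0"))

# On the caller, with RPC already initialized:
fut = rpc.rpc_async("worker1", make_batch)
batch = fut.wait()  # the CUDA-aware future extracts DataPtrs from Batch by pickling
```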
ghstack-source-id: 127295014

Test Plan: Added a test later in the stack

Reviewed By: mrshenli

Differential Revision: D27887189

fbshipit-source-id: 9d27e4e62390b836e5bb4f06f401cc002f0cf95b
2021-04-24 07:06:28 -07:00
Luca Wehrstedt
15ca379bde Add CUDA support to a user-created torch.futures.Future (#56517)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56517

Currently a torch.futures.Future could wrap a CUDAFuture, but one could not be created from scratch. This prevented users from using CUDAFutures on some occasions, for example when using `rpc.functions.async_execution`, or in their own code. I don't see any reason for such a limitation, hence here I add support for this.
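
A sketch of the user-facing side of this change, assuming the `devices` keyword argument introduced for this purpose:
```
import torch

# A CUDA-aware future created from scratch; `devices` declares where the
# result may live, so completion can be synchronized via CUDA events.
fut = torch.futures.Future(devices=[torch.device("cuda:0")])

def produce(f):
    with torch.cuda.stream(torch.cuda.Stream()):
        f.set_result(torch.ones(4, device="cuda:0"))

produce(fut)
print(fut.wait())
```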
ghstack-source-id: 127261554

Test Plan: Added a test later in the stack

Reviewed By: mrshenli

Differential Revision: D27887190

fbshipit-source-id: ecbb39c1ad7cd189d478ded9c361448f05a270ad
2021-04-23 08:13:56 -07:00
BowenBao
818ce1d0d2 Add standardOps match more input type in ORT (#53813) (#56172)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56172

Enable the standard ops, including **Add/Sub/Mul/Div/Gemm/Pow/Mod**, to accept low-precision input in ORT.

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D27866136

Pulled By: SplitInfinity

fbshipit-source-id: f2cf5649fffefd68c0cc7b6dce94198751636727
2021-04-21 17:58:08 -07:00
BowenBao
9986b109d2 [ONNX] Fix assign input shape for tuple inputs & primitive type inputs (#54112) (#56164)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56164

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D27866139

Pulled By: SplitInfinity

fbshipit-source-id: c59f5a07df685e1ccdc4860d603ec422ec80d188
2021-04-20 23:00:37 -07:00
Zhengxu Chen
8176ab6ca0 [JIT] Put explicit error message on class attribute accesses. (#55723)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55723

Resolving https://github.com/pytorch/pytorch/issues/51139

Test Plan:
python test/test_jit.py TestClassType.test_unresolved_attributes

Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D27691960

fbshipit-source-id: 1d078a4ab25af1a73109ca6ef0333a67a634bff6
2021-04-16 15:47:10 -07:00
Bert Maher
8e82e932f3 Reland: D27652485: [nnc] Enable CPU fusion only when num_threads == 1" (#56120)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56120

This reverts commit ad17fadbfc (D27786457).

The big annoyance here is that depending on the threading mode you may not be
able to toggle num_threads at will, so the fusion tests won't fail.

I hate this solution, but I'm adding a secondary override for the TE fuser.
Now you need to turn on fusion (`_jit_override_can_fuse_on_cpu`); you're
OK if you're running with 1 thread, otherwise you also need to call
`_jit_set_texpr_parallel_cpu_enabled` to enable it anyway.

This is (a) mainly for tests, since a real user probably won't fiddle aimlessly
with the thread count, and (b) will go away once NNC's threading support is
fully baked.
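
A sketch of the resulting double opt-in, assuming both toggles are exposed on `torch._C` like the other JIT fusion switches:
```
import torch

# Opt in to CPU fusion...
torch._C._jit_override_can_fuse_on_cpu(True)

# ...and, when running with more than one thread, additionally opt in to the
# secondary parallel-CPU override added by this commit.
if torch.get_num_threads() > 1:
    torch._C._jit_set_texpr_parallel_cpu_enabled(True)
```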

Test Plan: Imported from OSS

Reviewed By: Krovatkin

Differential Revision: D27788199

Pulled By: bertmaher

fbshipit-source-id: 070d04474f15e9689dbdf8cc1fde43050c6506b1
2021-04-15 15:50:18 -07:00
Edward Yang
6ec71ed4f9 Replace all direct cdata access with THPVariable_Unpack (#55799)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55799

I'm going to change the implementation of cdata soon, so I need to
abstract over cdata access with a function.  Additionally, many
users are manually casting to THPVariable to access
the member, so I can remove these unsafe casts from the client code
(the implementation, of course, is still doing an unsafe cast).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D27712130

Pulled By: ezyang

fbshipit-source-id: 95fcc013bf3913d67f2c634068eb5b3aab144cb3
2021-04-15 08:57:04 -07:00
James Reed
71a5314591 Fix ScriptMethod dispatch on __torch_function__ (#56103)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56103

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D27784142

Pulled By: jamesr66a

fbshipit-source-id: 555dcb7c3a98b8fb9e9ca9b499cafad54e819aa7
2021-04-15 08:46:43 -07:00
Nikitha Malgi
88c06d9dfc Add cuda device synchronization support in JIT (#55469)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55469

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D27749077

Pulled By: nikithamalgifb

fbshipit-source-id: bce3d331ab781cf3232b47b4f02ef504b9eadc7e
2021-04-14 09:13:07 -07:00
Nikita Shulga
6a39613f35 [BE] Make torch/csrc/jit/tensorexpr/ clang-tidy clean (#55628)
Summary:
Mostly auto-generated changes using
```
 python3 tools/clang_tidy.py -c build -x torch/csrc/jit/tensorexpr/eval.cpp -s
```
With the following common patterns manually fixed:
- Use ` = default` instead of `{}`
- deleted methods should be public
- Use pass-by-value + std::move instead of pass-by-reference+copy

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55628

Reviewed By: walterddr

Differential Revision: D27655378

Pulled By: malfet

fbshipit-source-id: 92be87a08113435d820711103ea9b0364182c71a
2021-04-08 19:44:14 -07:00
Jacob Szwejbka
20d7916a6a [Pytorch Mobile] Fold Conv BatchNorm for functions besides forward (#54619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54619

Minor refactor to conv batchnorm folding to work on other functions besides forward
ghstack-source-id: 125767010

Test Plan: unit test and {P339453712}

Reviewed By: kimishpatel

Differential Revision: D27301452

fbshipit-source-id: 4e0cc544a171a970583979a496b2908935124497
2021-04-06 13:07:12 -07:00
Nikitha Malgi
197f9f0826 Merge CUDA Streams and Events (#53902)
Summary:
-----------
- Updates the current_stream and default_stream APIs to take an `optional[device]` argument (see the sketch after this list)
- Adds parsing logic to replace `torch.cuda.Stream` and `torch.cuda.Event` with `torch.classes.cuda.Stream` and `torch.classes.cuda.Event` for JIT
- Merges the StreamContext manager for both Eager and JIT.
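
A minimal eager-mode sketch of the updated stream APIs; under TorchScript, the same calls are rewritten to their `torch.classes.cuda` equivalents by the parsing logic above:
```
import torch

# current_stream and default_stream now accept an optional device
s_cur = torch.cuda.current_stream()                       # defaults to the current device
s_def = torch.cuda.default_stream(torch.device("cuda:0"))

# The StreamContext manager is shared between Eager and JIT after this change
with torch.cuda.stream(s_def):
    x = torch.ones(4, device="cuda:0") * 2
torch.cuda.current_stream().wait_stream(s_def)
```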

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53902

Test Plan:
------
Run JIT tests:
python test/test_jit.py -v TestCUDA

Run eager tests:
python test/test_cuda.py -v TestCuda

Reviewed By: glaringlee

Differential Revision: D27494627

Pulled By: nikithamalgifb

fbshipit-source-id: b30b0570e38a33fb335c83762eb06ffd46a44b5c
2021-04-05 08:19:55 -07:00
Mike Ruberry
c0ac0fef4e Revert D27448156: irange for size_t
Test Plan: revert-hammer

Differential Revision:
D27448156 (041b4431b2)

Original commit changeset: 585da57d4de9

fbshipit-source-id: 8e047c29f391c0166e0a1a87c3fb2a0854377365
2021-04-03 19:14:00 -07:00
Richard Barnes
041b4431b2 irange for size_t (#55163)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55163

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D27448156

fbshipit-source-id: 585da57d4de91c692b6360d65f7b8a66deb0f8c1
2021-04-02 23:22:29 -07:00
Meghan Lele
6866c033d5 [JIT] Add recursive scripting for class type module attributes (#55124)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55124

**Summary**
This commit modifies type inference (used by the module scripting code)
so that it tries to script the type of any class instances that it
encounters. This enables recursive, automatic scripting of class type
module attributes.
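
A minimal sketch of what this enables; the class and attribute names are hypothetical:
```
import torch

class Normalizer:  # plain Python class, scripted automatically when encountered
    def __init__(self, scale: float):
        self.scale = scale

    def apply(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.scale

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.norm = Normalizer(0.5)  # class type module attribute

    def forward(self, x):
        return self.norm.apply(x)

m = torch.jit.script(M())  # Normalizer is recursively scripted via type inference
```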

**Test Plan**
This commit adds a test case for this to `TestClassType`.

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D23971883

Pulled By: SplitInfinity

fbshipit-source-id: 7a5a2e7c12ee68cbdeb0a07e6aaf98734a79cb06
2021-04-02 12:16:21 -07:00
Negin Raoof
cd9dd653e9 [ONNX] Support primitive type input/outputs and attributes (#53550) (#54864)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54864

Support primitive type attributes. Needed for Silero model.

Test Plan: Imported from OSS

Reviewed By: nikithamalgifb

Differential Revision: D27408982

Pulled By: SplitInfinity

fbshipit-source-id: 16b291eedbe9f9bb31d7664a29a484555df53755
2021-03-31 21:14:20 -07:00
Rohan Varma
a37fbf9b45 [Futures] Bump log verbosity when ignoring cb errors in python future. (#54476)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54476

Per title. For `add_done_callback`, we log but swallow exceptions in order to stay consistent with what the concurrent.futures Python library does; see the discussion in https://github.com/pytorch/pytorch/pull/45675.

It would be good to improve the verbosity here, as this can be a source of confusion if users are setting a different future via `add_done_callback` and an error is hit, resulting in an unexpected hang (see https://github.com/pytorch/pytorch/issues/52132 for more details on how this can happen).
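
A minimal sketch of the behavior in question:
```
import torch

fut = torch.futures.Future()

def cb(f):
    raise RuntimeError("boom")  # logged (now more verbosely) but swallowed,
                                # matching concurrent.futures semantics

fut.add_done_callback(cb)
fut.set_result(1)    # cb runs here; its error does not propagate
print(fut.wait())    # still returns 1
```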
ghstack-source-id: 125300389

Test Plan: CI

Reviewed By: lw

Differential Revision: D27253004

fbshipit-source-id: 72ed21c8fb6d27de5797c17fc46b762f893e6fea
2021-03-31 15:17:06 -07:00
Jianyu Huang
7fc03dd7c9 Back out "[pytorch][PR] Merge CUDA Streams and Events" (#54996)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54996

Original commit changeset: 45d9fee9a582

Test Plan: CI

Reviewed By: jspark1105

Differential Revision: D27444718

fbshipit-source-id: deb627230817923eaf84ade50ecb14bfbce4e779
2021-03-31 10:21:35 -07:00
Michael Suo
8a170fbacd [package] fix mangling issues with TorchScript (#54915)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54915

TorchScript and torch.package have different mangling schemes. To avoid
them interfering with each other, we should undo the torch.package
mangling before processing anything with TorchScript (since TS
independently makes sure that no names collide).

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D27410472

Pulled By: suo

fbshipit-source-id: d1cc013c532d9abb7fb9615122bc465ded4785bb
2021-03-31 00:58:05 -07:00
anjali411
1bccd48465 Allow creating SugaredValue for a complex valued IValue and deserialization logic for "infj" and "nanj" global constants (#54328)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54328

Test Plan: Imported from OSS

Reviewed By: nikithamalgifb

Differential Revision: D27369134

Pulled By: anjali411

fbshipit-source-id: aec26750a6fc8917ee15306684b743d13a91570c
2021-03-29 14:46:29 -07:00
Nikitha Malgi
416ba5c48f Merge CUDA Streams and Events (#53902)
Summary:
-----------
- Updates the current_stream and default_stream APIs to take an `optional[device]` argument
- Adds parsing logic to replace `torch.cuda.Stream` and `torch.cuda.Event` with `torch.classes.cuda.Stream` and `torch.classes.cuda.Event` for JIT
- Merges the StreamContext manager for both Eager and JIT.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53902

Test Plan:
------
Run JIT tests:
python test/test_jit.py -v TestCUDA

Run eager tests:
python test/test_cuda.py -v TestCuda

Reviewed By: SplitInfinity

Differential Revision: D27285996

Pulled By: nikithamalgifb

fbshipit-source-id: 45d9fee9a582b5f4c82330f5f99eb88584804270
2021-03-26 14:19:39 -07:00
anjali411
f9ca0d87a7 Teach Python TS frontend to parse complex literals (#52881)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52881

**This PR adds:**
1. logic to parse complex constants (complex literals of the form `bj`)
2. logic to parse complex lists
3. support for complex constructors: `complex(tensor/int/float/bool, tensor/int/float/bool)`
4. Limited operator support
     - `add`, `sub`, `mul`, `torch.tensor`, `torch.as_tensor`
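
A minimal sketch of the additions listed above, staying within the limited operator support:
```
import torch

@torch.jit.script
def f(x: torch.Tensor):
    c = 3 + 4j            # complex literal of the form `bj`
    d = complex(2, 3.5)   # complex constructor: complex(int, float)
    zs = [c, d, 1 - 2j]   # complex list
    return x * zs[0] + d  # mul/add are among the supported ops
```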

**Follow-up work:**
1. Add complex support for unary and other registered ops.
2. Support the complex constructor with a string as input (this is supported in Python eager mode).
3. Test all emitXYZ for all XYZ in `ir_emitter.cpp` (currently only emitConst and emitValueToTensor are tested), e.g., test loops etc.
4. ONNX doesn't support complex tensors, so we should error out with a clear and descriptive error message.

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D27245059

Pulled By: anjali411

fbshipit-source-id: af043b5159ae99a9cc8691b5a8401503fa8d6f05
2021-03-24 08:12:17 -07:00
Christian Puhrsch
2668149b8c Export torch::jit::toIValue (#54449)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54448

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54449

Reviewed By: SplitInfinity

Differential Revision: D27243154

Pulled By: cpuhrsch

fbshipit-source-id: fc21d6ce251b868356ad8ea13ae891fb56e311ce
2021-03-22 17:17:18 -07:00
Bin Bao
4626886f21 [JIT] Add CUDNN Conv-Add-Relu fusion for Frozen Model Optimization (#52102)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52102

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D26646100

fbshipit-source-id: 7f7a82cc0b42c958b9e0c854b3b5dc6ea7cfff6c
2021-03-18 15:18:52 -07:00
James Reed
255b103c1b [WIP] Function to retrieve inspect.Signature instances for PyTorch ops (#53830)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53830

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D26982802

Pulled By: jamesr66a

fbshipit-source-id: 18fddc9f3f34b09e173de59f2fe886f8eedd000e
2021-03-17 20:41:27 -07:00
Jacob Szwejbka
8f61b13e80 [Pytorch Mobile] Optimize Non Forward for Mobile (#53314)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53314

Introduction of an API for optimizing non-forward functions for mobile. As of this diff, all functions that you ask to optimize will be preserved, and those functions will be run through canonical optimization. The intention is to stack each further optimization onto separate diffs, since they touch multiple files and reviewing them all at once would be a nightmare.
ghstack-source-id: 123909414

Test Plan:
torch.utils.mobile_optimizer.optimize_for_mobile(net, methods_to_optimize=["forward", "foo"]) runs fine

torch.utils.mobile_optimizer.optimize_for_mobile(net, methods_to_optimize={"foo"}) optimizes just foo if the model doesn't define forward; otherwise it optimizes foo and forward

torch.utils.mobile_optimizer.optimize_for_mobile(net, methods_to_optimize=["forward"]) runs fine

torch.utils.mobile_optimizer.optimize_for_mobile(net) runs fine if the model defines forward; throws otherwise

Reviewed By: kimishpatel

Differential Revision: D26618689

fbshipit-source-id: 5bff1fb3f3f6085c4a649a8128af9c10f0fa9400
2021-03-17 14:31:24 -07:00
Thomas Viehmann
fd5c1123e4 wrap AliasDb in Python (#51336)
Summary:
Also added a wrapper for tlemo's graphviz export to string.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51336

Reviewed By: ezyang

Differential Revision: D26150809

Pulled By: eellison

fbshipit-source-id: 9beafce5cbdc1785b986b71c3cd986c1087faa11
2021-03-17 12:55:22 -07:00
BowenBao
57d1df071f [ONNX] Support inplace operations on inplace indexing (#52063) (#53306)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53306

* [ONNX] Fix for sequence of mutations in blocks (#51577)

Fixes consecutive mutations in a tensor inside blocks.
Also, support append and pop in blocks.

* Support inplace operations + indexing

* Clean up old pass for remove mutations

* Add loop test

* Fixes for set attr in loops

* Removing the new jit API flag

* [ONNX] Redesign onnx pass to enable shape type dependent pattern conversion - cont (#51795)

With the introduction of ONNX shape inference, shape and type are inferred on the fly as operators get converted from ATen to ONNX when running the symbolic function. This resolves the shape/type requirement for the symbolic functions. The pre-onnx passes, however, cannot be supported by shape inference, since at that stage the operators in the graph are still ATen operators.

This PR is to update the design of ONNX pass, to enable a mechanism of capturing subgraphs of ATen operators of certain patterns, and convert them later, when shape/type information of upstream operators are available.

The new design will require pre-onnx passes that need shape/type to be written in two parts, encapsulation and conversion.

    The encapsulation part will find the nodes of patterns, like how pre-onnx passes were written previously. But instead of converting the nodes, it will encapsulate them into a sub-block of a new placeholder node. This part is called before onnx pass, so it runs before calling symbolic functions.

    The conversion part will be called inside the onnx pass. In onnx pass, run_symbolic_func will be called for each node in topological order. When it reaches the placeholder node, the conversion part will be invoked. It will convert the nodes inside the sub-block based on pattern. By that time, it will have shape/type of upstream operators available. After the conversion is complete, the placeholder node will be removed, and nodes inside its sub-block converted. Run_symbolic_func will be called for these nodes, and they will be converted from ATen operator to ONNX operator.

This PR includes several other fixes, listed below.
* ~~replace helper.cpp with onnx_utils.cpp for holding utility functions.~~
* fix EraseNumberTypes on the Bool type; the code predated the existence of the Bool type.
* ~~enable onnx shape inference in export with parameter/initializer data.~~
* other code clean ups.
* fix insertion of identity nodes for loop opset 13 sequence output.

~~PR depends on #51603~~

* Fix after merge

* clang

* Fix clang

* Fix clang

* Fix warning message.

* Fixes for non-model param attributes

* Fix for caffe2

* Additional test

* clang

* Skip test for lower opsets

* fix clang-tidy

* Update init.cpp

* Update remove_inplace_ops_for_onnx.cpp

* Update remove_inplace_ops_for_onnx.cpp

* Update remove_inplace_ops_for_onnx.cpp

* Fix for clang formatting

Test Plan: Imported from OSS

Reviewed By: pbelevich, malfet

Differential Revision: D26922416

Pulled By: SplitInfinity

fbshipit-source-id: e7108620b39b6404c594910786c4d275fee59d84

Co-authored-by: Bowen Bao <bowbao@microsoft.com>
2021-03-12 02:49:11 -08:00
BowenBao
3f9c803fe8 [ONNX] Redesign onnx pass to enable shape type dependent pattern conversion - cont (#51795) (#53304)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53304

With the introduction of ONNX shape inference, shape and type are inferred on the fly as operators get converted from ATen to ONNX when running the symbolic function. This resolves the shape/type requirement for the symbolic functions. The pre-onnx passes, however, cannot be supported by shape inference, since at that stage the operators in the graph are still ATen operators.

This PR is to update the design of ONNX pass, to enable a mechanism of capturing subgraphs of ATen operators of certain patterns, and convert them later, when shape/type information of upstream operators are available.

The new design will require pre-onnx passes that need shape/type to be written in two parts, encapsulation and conversion.

    The encapsulation part will find the nodes of patterns, like how pre-onnx passes were written previously. But instead of converting the nodes, it will encapsulate them into a sub-block of a new placeholder node. This part is called before onnx pass, so it runs before calling symbolic functions.

    The conversion part will be called inside the onnx pass. In onnx pass, run_symbolic_func will be called for each node in topological order. When it reaches the placeholder node, the conversion part will be invoked. It will convert the nodes inside the sub-block based on pattern. By that time, it will have shape/type of upstream operators available. After the conversion is complete, the placeholder node will be removed, and nodes inside its sub-block converted. Run_symbolic_func will be called for these nodes, and they will be converted from ATen operator to ONNX operator.

This PR includes several other fixes, listed below.
* ~~replace helper.cpp with onnx_utils.cpp for holding utility functions.~~
* fix EraseNumberTypes on the Bool type; the code predated the existence of the Bool type.
* ~~enable onnx shape inference in export with parameter/initializer data.~~
* other code clean ups.
* fix insertion of identity nodes for loop opset 13 sequence output.

~~PR depends on #51603~~

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D26922417

Pulled By: malfet

fbshipit-source-id: 14ed06158d539e2451c2e5e63ba1b32fb0f75095
2021-03-11 10:30:09 -08:00
Nikitha Malgi
cfaa0bf286 [JIT] Update Namespace from cuda to _cuda (#53378)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53378

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D26970607

Pulled By: nikithamalgifb

fbshipit-source-id: 20a55dd9c0071c5870a4b176d30cb9c1e1496687
2021-03-11 00:52:01 -08:00
Michael Suo
b4d8f4af82 [package] implement get_resource_reader API (#51674)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51674

See
https://docs.python.org/3/library/importlib.html#importlib.abc.ResourceReader
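
A usage sketch; the archive, package, and resource names are purely hypothetical:
```
from torch.package import PackageImporter

imp = PackageImporter("model.pt")                  # hypothetical archive
reader = imp.get_resource_reader("assets")         # an importlib.abc.ResourceReader
print(list(reader.contents()))                     # enumerate resource names
data = reader.open_resource("config.json").read()  # hypothetical resource
```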

Test Plan: Imported from OSS

Reviewed By: zdevito

Differential Revision: D26237034

Pulled By: suo

fbshipit-source-id: 4c19f6172d16b710737528d3de48372873b9368d
2021-03-10 12:11:11 -08:00
Meghan Lele
60ed8fb244 [JIT] Enable ModuleList non-literal indexing (#53410)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53410

**Summary**
This commit enables indexing into `ModuleList` using a non-literal
index if the LHS of the assignment statement of which the indexing is
the RHS is annotated with an interface type.

This feature already exists for `ModuleDict`, and this commit builds on
top of that implementation. A `prim::ModuleContainerIndex` operator is
emitted for any statement of the form `lhs: InterfaceType =
module_container[idx]`. The same operator has to be used for both
`ModuleDict` and `ModuleList` because serialization does not preserve
the metadata that indicates whether a `Module` is a `ModuleDict` or
`ModuleList`.
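
A sketch of the pattern this enables, mirroring the existing `ModuleDict` support:
```
import torch

@torch.jit.interface
class ModuleInterface(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pass

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.mods = torch.nn.ModuleList([torch.nn.ReLU(), torch.nn.Tanh()])

    def forward(self, x: torch.Tensor, idx: int) -> torch.Tensor:
        value: ModuleInterface = self.mods[idx]  # non-literal index, interface-typed LHS
        return value.forward(x)

scripted = torch.jit.script(M())
```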

**Testing**
This commit extends the existing unit tests for non-literal `ModuleDict`
indexing to test non-literal `ModuleList` indexing.

**Fixes**
This commit fixes #47496.

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D26857597

Pulled By: SplitInfinity

fbshipit-source-id: d56678700a264d79aae3de37ad6b08b080175f7c
2021-03-09 16:11:34 -08:00
Sean Silva
34d9278c19 Remove notion of "level" from Module::dump_to_str. (#52539)
Summary:
The code uses `torch::jit::jit_log_prefix` for handling recursive
indenting in most places in this function. There was one place that was
using "level", but it was buggy -- it would result in a compounding
superlinear indent. Note that changing it to "level+1" doesn't fix the
bug.

Before/after:
https://gist.github.com/silvasean/8ee3ef115a48de6c9c54fbc40838d8d7

The new code establishes a recursive invariant for
`Module::dump_to_str`: the function returns the module printed at the
base indent level (i.e. no indent). `torch::jit::jit_log_prefix` is used
to prefix recursive calls. The code was already nearly there, except for
this spurious use of "level".

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52539

Reviewed By: navahgar

Differential Revision: D26773657

Pulled By: gmagogsfm

fbshipit-source-id: ab476f0738bf07de9f40d168dd038dbf62a9a79e
2021-03-09 05:45:57 -08:00
Raghavan Raman
d3cde6c23c [NNC] Implementation for aten::cat without conditionals. (#53128)
Summary:
This PR adds an implementation for `aten::cat` in NNC without any conditionals. This version is not enabled by default.

Here is the performance of some micro benchmarks with and without conditionals. There is up to 50% improvement in performance without conditionals for some of the shapes.

aten::cat implementation in NNC **with** conditionals
```
$ python -m benchmarks.tensorexpr --device cpu --mode fwd --jit_mode trace --cpu_fusion concat
pt: concat2d2input_fwd_cpu_1_160_1_14_1: 5.44 us, SOL 0.26 GB/s, algorithmic 0.51 GB/s
pt: concat2d2input_fwd_cpu_1_580_1_174_1: 5.75 us, SOL 1.05 GB/s, algorithmic 2.10 GB/s
pt: concat2d2input_fwd_cpu_20_160_20_14_1: 6.87 us, SOL 4.05 GB/s, algorithmic 8.11 GB/s
pt: concat2d2input_fwd_cpu_20_580_20_174_1: 14.52 us, SOL 8.31 GB/s, algorithmic 16.62 GB/s
pt: concat2d2input_fwd_cpu_8_512_8_512_1: 9.58 us, SOL 6.84 GB/s, algorithmic 13.68 GB/s
```
aten::cat implementation in NNC **without** conditionals
```
$ python -m benchmarks.tensorexpr --device cpu --mode fwd --jit_mode trace --cpu_fusion --cat_wo_conditionals concat
pt: concat2d2input_fwd_cpu_1_160_1_14_1: 4.67 us, SOL 0.30 GB/s, algorithmic 0.60 GB/s
pt: concat2d2input_fwd_cpu_1_580_1_174_1: 5.65 us, SOL 1.07 GB/s, algorithmic 2.14 GB/s
pt: concat2d2input_fwd_cpu_20_160_20_14_1: 6.10 us, SOL 4.56 GB/s, algorithmic 9.12 GB/s
pt: concat2d2input_fwd_cpu_20_580_20_174_1: 7.44 us, SOL 16.22 GB/s, algorithmic 32.44 GB/s
pt: concat2d2input_fwd_cpu_8_512_8_512_1: 6.46 us, SOL 10.14 GB/s, algorithmic 20.29 GB/s
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53128

Reviewed By: bertmaher

Differential Revision: D26758613

Pulled By: navahgar

fbshipit-source-id: 00f56b7da630b42bc6e7ddd4444bae0cf3a5780a
2021-03-07 22:57:02 -08:00
James Reed
1fe6a6507e [WIP][FX] Fix tracing support for torchbind (#52884)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52884

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D26675801

Pulled By: jamesr66a

fbshipit-source-id: 8e5100bcea17589a53163abf6ab991658e11fa3a
2021-03-05 23:40:16 -08:00
Bram Wasti
56f8379802 [static runtime] Move all heavy constructor logic into InferenceModule (renamed to StaticModule) (#51564)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51564

Constructor logic was spread throughout InferenceModule and StaticRuntime. This diff unifies the two. After a lot of discussion on diff D25961626, it became apparent that `clone` is uglier than a cheap StaticRuntime.

This means StaticRuntime is effectively StaticModule and the only code in the new StaticRuntime is the `run` functions.

```
graph, schema = PrepareForStaticModule(torchscript_module)
sm = StaticModule(graph, schema, options)
sm(inputs)
# or create many cheap runtimes with the module
sr = StaticRuntime(sm)
sr(inputs)
```

Changelist:
- Rename InferenceModule StaticModule
- Move all logic for construction into StaticModule
- Create a new StaticRuntime that only has a unique memory planner (everything else is in StaticModule)
- Update comments with explanation
- Propagate all changes to predictor integration
- Propagate all changes to python integration
- Change semantics to be a bit more PyTorch-standard (no "run" calls, no "get_" getters).

Test Plan:
buck test //caffe2/test:static_runtime
buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest

Reviewed By: hlu1

Differential Revision: D25592967

fbshipit-source-id: 8233bed03137ce129137af2d44bce0095033ef0f
2021-03-05 10:15:26 -08:00
Joel Schlosser
6557ea0509 Context manager for hiding source ranges (#53188)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/52456

## Background

Provides a context manager `_hide_source_ranges()` that disables printing graph source ranges by default. It can be overridden on a per-graph basis if desired.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53188

Test Plan:
```
python test/test_jit.py TestJit.test_hide_source_ranges_context_manager
```

```python
import torch

@torch.jit.script
def foo(x):
    return torch.add(x, x)

print(foo.graph)
with torch.jit._hide_source_ranges():
    print(foo.graph)

    # Override context manager
    print(foo.graph.str(print_source_ranges=True))

print(foo.graph)
```

```
graph(%x.1 : Tensor):
  %3 : int = prim::Constant[value=1]()
  %4 : Tensor = aten::add(%x.1, %x.1, %3) # /Users/jbschlosser/misc/example.py:5:11
  return (%4)

graph(%x.1 : Tensor):
  %3 : int = prim::Constant[value=1]()
  %4 : Tensor = aten::add(%x.1, %x.1, %3)
  return (%4)

graph(%x.1 : Tensor):
  %3 : int = prim::Constant[value=1]()
  %4 : Tensor = aten::add(%x.1, %x.1, %3) # /Users/jbschlosser/misc/example.py:5:11
  return (%4)

graph(%x.1 : Tensor):
  %3 : int = prim::Constant[value=1]()
  %4 : Tensor = aten::add(%x.1, %x.1, %3) # /Users/jbschlosser/misc/example.py:5:11
  return (%4)
```

Reviewed By: walterddr, zhangguanheng66

Differential Revision: D26817070

Pulled By: jbschlosser

fbshipit-source-id: e9d123452c616b0a9dda9e134ef6c2886f229d9b
2021-03-04 09:11:08 -08:00
Tugsbayasgalan Manlaibaatar
4008df3507 Add property binding in torchbind (#50670)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50670

This PR adds property support to Torchbind. There are two cases in which it needs to work:

**TorchScript**
Inside TorchScript, we don't go through pybind, so there is no issue with accessing properties through ClassType.

**Eager Mode**
In Eager Mode, Torchbind creates a ScriptObject, to which we cannot dynamically add (and therefore access) properties after initialization (https://stackoverflow.com/questions/1325673/how-to-add-property-to-a-class-dynamically). Therefore we created a Python wrapper (ScriptObjectWrapper) around ScriptObject, on which we can use the `property` mechanism to set properties. This way, we can look up the wrapped object's properties through the `__getattr__` method of ScriptObjectWrapper. This logic is inspired by https://github.com/pytorch/pytorch/pull/44324
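
A sketch of the eager-mode behavior, using a purely hypothetical torchbind class assumed to be registered in C++ with `.def_property(...)`:
```
import torch

# Assumes a C++ registration along the lines of:
#   torch::class_<MyClass>("my_namespace", "MyClass")
#       .def(torch::init<>())
#       .def_property("value", &MyClass::getValue, &MyClass::setValue);
obj = torch.classes.my_namespace.MyClass()

obj.value = 42    # setter resolved through the ScriptObjectWrapper
print(obj.value)  # getter resolved via the wrapper's __getattr__
```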

Test Plan:
test cases in test_torchbind.py

Imported from OSS

Reviewed By: pbelevich

Differential Revision: D26632781

fbshipit-source-id: dd690887cfda0c48ff0d104aa240ce0ab09055bc
2021-03-03 14:25:52 -08:00
Elias Ellison
bfae3789ba Move conv to mkldnn (#51483)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51483

This PR moves the conv weights of a frozen model to MKLDNN, and AOT-reorders the weights. When the weights are already in MKLDNN, just computing a single conv by converting the input and output from/to MKLDNN provides large speedups. I benchmarked the results of the top 200 shapes in predictor [here](https://www.internalfb.com/phabricator/paste/view/P171537938), and also verified that it sped up popular models in torchvision.
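
The frozen-graph pass performs the reordering automatically; the following eager-mode sketch illustrates the same layout conversion using the existing `torch.utils.mkldnn` helper:
```
import torch
from torch.utils import mkldnn as mkldnn_utils

conv = torch.nn.Conv2d(3, 16, kernel_size=3).eval()
mk_conv = mkldnn_utils.to_mkldnn(conv)  # reorder weights to MKLDNN layout ahead of time

x = torch.randn(1, 3, 224, 224)
y = mk_conv(x.to_mkldnn()).to_dense()   # convert input/output at the boundary
```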

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D26696703

Pulled By: eellison

fbshipit-source-id: 0b4441bee4f6e0890a4540fbca3bb5e58b8c5adf
2021-03-01 21:19:27 -08:00
jiej
4d94ee566e Ge v1 (#52136)
Summary:
This is a second attempt to use the graph executor to run forward on a gradient. This allows a second chance to profile intermediate tensors introduced by autodiff.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52136

Reviewed By: pbelevich

Differential Revision: D26693978

Pulled By: Krovatkin

fbshipit-source-id: 91dde8009a210950af8e5173668ada241e16dd52
2021-02-28 00:53:13 -08:00
Meghan Lele
1d6bd15790 [JIT] Add torch._C._jit submodule (#52910)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52910

**Summary**
PR #52158 tried to move all JIT bindings from `torch._C` to a new
submodule `torch._C._jit`, but that...did not go well. This pull request
adds the new `torch._C._jit` submodule, but does not migrate the
existing bindings. Instead, it adds a unit test that fails if any new
bindings are added to `torch._C`. A comment in the test instructs
developers to add their new binding to the allowlist if it really should
be in `torch._C`, or to add it to the appropriate submodule (e.g.,
`torch._C._jit`). The idea is to prevent the issue
described in #51691 from getting *worse* if it cannot be fixed.

**Test Plan**
Continuous integration.

**Fixes**
This commit fixes #51691.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D26698373

Pulled By: SplitInfinity

fbshipit-source-id: ec9f5426051227a513d4fd09512b624420e0100b
2021-02-26 16:05:05 -08:00
Lillian Johnson
b72a72a477 torch.Package extend PyTorchStreamWriter to track written records (#52218)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52218

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D26429794

Pulled By: Lilyjjo

fbshipit-source-id: 5f68e7991c673ada629d0370c705520243d0637a
2021-02-22 15:02:41 -08:00
Nikolay Korovaiko
847d1d4d53 add debug_flush_compilation_cache to Method (#52317)
Summary:
Forgot to add `debug_flush_compilation_cache` to `Method` as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52317

Reviewed By: bdhirsh

Differential Revision: D26583313

Pulled By: Krovatkin

fbshipit-source-id: 1b3e503950cc3314796aff53b3b8038d16767870
2021-02-22 12:31:09 -08:00
Zachary DeVito
60518d10f6 [deploy] torch::deploy API (#51754)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51754

This API allows you to manage multiple python interpreters in a single
process to deploy PyTorch models packaged with torch.package.

torch/csrc/deploy/deploy.h contains the API definition
torch/csrc/deploy/test_deploy.cpp has some examples.

Notes:
* mutex is added to PyTorchStreamReader to make it safe to use from multiple threads at once.
* USE_DEPLOY is only true for the special libtorch_deployinterpreter.so library; when enabled,
  we use a hash table to maintain the PyObject <-> at::Tensor mapping rather than the internal pointer
  in Tensor, since more than one interpreter may have a reference to the tensor.
* serialization.py has some additional functions for creating pickle objects
  while keeping storages in memory, for use in transferring tensors between interpreters

Test Plan: Imported from OSS

Reviewed By: wconstab

Differential Revision: D26329468

Pulled By: zdevito

fbshipit-source-id: d75f4ebb9a27f1d911179d9996041bcb3ca04a07
2021-02-18 02:30:08 -08:00