Commit Graph

15 Commits

Author SHA1 Message Date
Han Qi
4eb772fde6 Refactor saving jit::Module to mobile .pt in 2 steps: (#66494)
Summary:
1. Convert Function -> mobile::Function.
2. Serialize mobile::Function.

This also opens the opportunity to create a mobile::Module without saving and reloading.
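
A minimal user-side sketch of what "saving to mobile .pt" refers to; the module and file name are illustrative, and `_save_for_lite_interpreter` is assumed to be the saving entry point:

```python
import torch
import torch.nn as nn

class Adder(nn.Module):
    def forward(self, x, y):
        return x + y

scripted = torch.jit.script(Adder())  # jit::Module holding jit Functions
# Internally the save now (1) lowers each Function to a mobile::Function and
# (2) serializes the resulting mobile::Module, without a save/reload round trip.
scripted._save_for_lite_interpreter("adder.ptl")  # assumed mobile save API
```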

Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66494

Reviewed By: zhxchen17

Differential Revision: D32293022

Pulled By: qihqi

fbshipit-source-id: 29b43d47ff86071d5e2f9d6ca4dba4445711ce3d
2021-11-17 12:02:20 -08:00
jjsjann123
0dc3f829d9 Nvfuser code bump 11 5 (#67943)
Summary:
nvfuser code update:
1. Tuned scheduler heuristics for reduction/normalization kernels;
2. bfloat16 support on IO tensors;
3. Refactored memory format support: we can now collapse dimensions for input tensors with non-coherent (differing) memory formats, e.g. a channels-last tensor input to batch normalization (see the sketch after this list). Note that memory format is currently limited to Contiguous and Channels Last;
4. Refactored nvfuser graph partitioning in `graph_fuser.cpp`, separating the node-merge and profile-node APIs. Updated `profiling_record.cpp`.
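
A minimal sketch of the memory-format case mentioned in item 3 (channels-last input to batch norm); this assumes a CUDA build, and enabling nvfuser via `torch._C._jit_set_nvfuser_enabled(True)` is an assumption whose exact flag name may vary by version:

```python
import torch
import torch.nn as nn

torch._C._jit_set_nvfuser_enabled(True)  # assumed toggle; name may differ

bn = nn.BatchNorm2d(8).cuda().eval()
scripted = torch.jit.script(bn)

# channels-last input to batch normalization
x = torch.randn(4, 8, 16, 16, device="cuda").to(memory_format=torch.channels_last)
out = scripted(x)
print(out.is_contiguous(memory_format=torch.channels_last))
```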

Things that are reverted from our local branch:
1. changes to some entries in autodiff
2. aten::gelu with approximation
3. native_dropout(_backward)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67943

Reviewed By: ngimel

Differential Revision: D32288709

Pulled By: dzhulgakov

fbshipit-source-id: fc9491182ea7e0158bc112c66f096823c588eaf1
2021-11-17 01:22:17 -08:00
David Berard
5cfca5524c [JIT] clear GraphFunction.optimized_graphs_ after freezing a module (#68316)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68316

Consider the following:
```
import torch
import torch.nn as nn

class Mod(nn.Module):
    def __init__(self, val):
        super().__init__()
        self.param = nn.Parameter(val)

    def forward(self, x):
        # this method will change during freezing
        return x + self.param

    @torch.jit.export
    def make_prediction(self, x):
        y = x + x
        return self.forward(y)

param = torch.rand([2, 2])

unscripted_mod = Mod(param)
mod = torch.jit.script(unscripted_mod)
mod.eval()
mod = torch.jit.freeze(mod, preserved_attrs=["make_prediction"])
```

During freezing the following will occur:
1. Do some pre-freezing, including inlining; in particular, forward will be inlined into make_prediction. During inlining, forward.optimized_graph() is called, and the result is cached.
2. Freeze some methods. While freezing forward, the graph associated with the function gets updated, but the cached optimized_graphs_ are not.

Previously, a call to `mod.forward(x)` would return an executor that would run the old cached optimized_graph(). This would mean that the freezing optimizations would not apply, and potentially that execution would fail because of parameters removed from the module.

This change clears the optimized_graphs_ cache after running freezing to prevent executing an old version of the graph.
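
Continuing the snippet above, a minimal sanity check of the fixed behavior (a sketch; the printed values depend on `param`):

```python
x = torch.rand([2, 2])
# With the stale optimized_graphs_ cache cleared, both entry points run the
# frozen graphs (self.param folded into a constant) rather than an outdated
# cached graph that still reads the removed parameter.
print(mod.make_prediction(x))
print(mod(x))
```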

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D32410862

Pulled By: davidberard98

fbshipit-source-id: dd8bfe86ec2898b7c72813ab32c08f25c38e4cea
2021-11-16 17:15:29 -08:00
Zhengxu Chen
5ef62c88a9 [jit] Replace get_executor() with call() in abstract Function interface. (#65969)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65969

ghstack-source-id: 141759210

Test Plan: no behavior change.

Reviewed By: anjali411

Differential Revision: D31326151

fbshipit-source-id: 201f6dc4c23fdb2531f6b8c73d26127f9e212de4
2021-10-28 13:11:29 -07:00
Zhengxu Chen
0795735351 [jit] Clean up unneeded virtual methods from Function interface. (#65968)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65968

tryToGraphFunction() should cover all cases and is more composable than
ad hoc virtual methods.
ghstack-source-id: 141759214

Test Plan: no behavior change.

Reviewed By: gmagogsfm

Differential Revision: D31326154

fbshipit-source-id: 692a35df424f7d4f777a96489c4cbb24b3ae7807
2021-10-28 12:28:48 -07:00
jjsjann123
1ec732bc46 Add fp16/fp32 autocasting to JIT/TorchScript (#63939)
Summary:
Adds mixed-precision autocasting support between fp32/fp16 to TorchScript/JIT. A more in-depth description can be found at [torch/csrc/jit/JIT-AUTOCAST.md](https://github.com/pytorch/pytorch/pull/63939/files#diff-1f1772aaa508841c5bb58b74ab98f49a1e577612cd9ea5c386c8714a75db830b)

This PR implements an autocast optimization pass (torch/csrc/jit/passes/autocast.cpp) that inserts casting ops per the AMP rules, mimicking the behavior of eager autocast. The pass also takes the `torch.cuda.amp.autocast` context into account and only inserts casting ops within the enabled context manager, giving feature parity with eager AMP autocast.

We currently provide JIT AMP autocast as a prototype feature, so it is off by default and can be turned on via `torch._C._jit_set_autocast_mode(True)`.
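
A minimal usage sketch under those assumptions (prototype flag enabled, CUDA available); the function body is illustrative:

```python
import torch
from torch.cuda.amp import autocast

torch._C._jit_set_autocast_mode(True)  # prototype feature, off by default

@torch.jit.script
def fused_matmul_relu(a, b):
    # casting ops are inserted only inside the enabled autocast region
    with autocast():
        return torch.relu(torch.mm(a, b))

a = torch.rand(4, 8, device="cuda")
b = torch.rand(8, 4, device="cuda")
out = fused_matmul_relu(a, b)
print(out.dtype)  # expected to be torch.float16 when autocasting applies
```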

The JIT support for autocast is subject to different constraints compared to the eager mode implementation (mostly related to the fact that TorchScript is statically typed); restrictions on the user-facing Python code are described in torch/csrc/jit/JIT-AUTOCAST.md

This is a prototype; there are also implementation limitations that were necessary to keep this PR small and get something functioning quickly upstream, so we can iterate on designs.

A few limitations/challenges that are not properly resolved in this PR:
1. Autocast inserts cast operations, which affect the scalar type of the output tensors feeding downstream operations. We are not currently propagating the updated scalar types, which can give wrong results for operations subject to type promotion rules.

2. Backward for autodiff in JIT misses casting dgrad to the input scalar type, as autograd does in eager mode. This forces us to explicitly mark the casting operation for certain operations (e.g. binary ops); otherwise, we might feed a dgrad whose scalar type does not match the input. This could break gradient functions that consume dgrad (e.g. gemm backward, which assumes grad_output has the same scalar type as the input).

3. The `torch.autocast` API has an optional `dtype` argument, which is not currently supported in JIT autocast; we require a static value.

Credit goes mostly to:
tlemo
kevinstephano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63939

Reviewed By: navahgar

Differential Revision: D31093381

Pulled By: eellison

fbshipit-source-id: da6e26c668c38b01e296f304507048d6c1794314
2021-10-27 12:11:36 -07:00
Zhengxu Chen
b55a2500d2 [jit] Remove graph() call from abstract Function interface. (#65967)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65967

Graph is an implementation detail. If the user wants access to the
underlying graph, they should explicitly dynamic-cast instead.
ghstack-source-id: 141659819

Test Plan: no behavior change.

Reviewed By: gmagogsfm

Differential Revision: D31326153

fbshipit-source-id: a0e984f57c6013494b92a7095bf5bb660035eb84
2021-10-27 11:54:26 -07:00
Mike Guo
6ecc1a4c4f Make pytorch clang-tidy clean (#60649)
Summary:
This PR suppresses clang-tidy warnings in the codebase (for now) so that we can re-enable clang-tidy checks on master.

I ran this script to add the `NOLINTNEXTLINE` comments (on a devserver):
```bash
python3 setup.py develop

# Uses same script that's run on CI and adds the -j (parallel), -s (add comments), -k (continue if diagnostic errors are found) options
python3 tools/clang_tidy.py \
  -j \
  -s \
  -k \
  -v \
  --paths torch/csrc/ \
  -g"-torch/csrc/jit/passes/onnx/helper.cpp" \
  -g"-torch/csrc/jit/passes/onnx/shape_type_inference.cpp" \
  -g"-torch/csrc/jit/serialization/onnx.cpp" \
  -g"-torch/csrc/jit/serialization/export.cpp" \
  -g"-torch/csrc/jit/serialization/import.cpp" \
  -g"-torch/csrc/jit/serialization/import_legacy.cpp" \
  -g"-torch/csrc/onnx/init.cpp" \
  -g"-torch/csrc/cuda/nccl.*" \
  -g"-torch/csrc/cuda/python_nccl.cpp" \
  -g"-torch/csrc/autograd/FunctionsManual.cpp" \
  -g"-torch/csrc/generic/*.cpp" \
  -g"-torch/csrc/jit/codegen/cuda/runtime/*" \
  -g"-torch/csrc/deploy/interpreter/interpreter.cpp" \
  -g"-torch/csrc/deploy/interpreter/interpreter.h" \
  -g"-torch/csrc/deploy/interpreter/interpreter_impl.h" \
  -g"-torch/csrc/deploy/interpreter/test_main.cpp"
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60649

Test Plan: Verified changes by re-running the script (without the `-s` option) and seeing no warnings/errors.

Reviewed By: walterddr, janeyx99

Differential Revision: D29504258

Pulled By: 1ntEgr8

fbshipit-source-id: 78310b30ee8213b73ddb4771ad874665323e7a4e
2021-07-01 12:21:07 -07:00
Gaoxiang Liu
735f8cc6c2 [DI] Allow explicit taskLauncher for torchscript interpreter (#46865)
Summary:
By default, TorchScript execution is single-threaded and uses the caller's thread pool. For the distributed inference use case, we would like a way to customize this behavior so that the TorchScript interpreter can be executed elsewhere. This diff allows an explicit taskLauncher for the TorchScript interpreter.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46865

Test Plan:
Unit tests pass.

fbshipit-source-id: 1d7b003926c0d1f8facc53206efb960cff8897ac

Fixes #{issue number}

Reviewed By: houseroad

Differential Revision: D24616102

Pulled By: garroud

fbshipit-source-id: 79202b62f92d0b0baf72e4bf7aa3f05e0da91d59
2020-11-04 17:07:55 -08:00
Ansha Yu
aac36a89ff [model transform] tuple to arglist jit pass (#36093)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36093

Unwrap any tuples (including NamedTuples) in the module forward
function's input list into an argument list.
1. Supports multiple tuple inputs, and traces their use through CallMethods and
TupleIndex
2. Does not unwrap inner uses of other tuples that did not show up in the
original top-level graph inputs

We work from the ScriptModule level instead of the Graph level because:
1. If the ScriptModule was previously called with the original set of inputs, the GraphExecutor caches the ExecutionPlan (specifically, the ArgumentSpecCreator is derived from the Graph and type-checks the inputs passed in)
2. Since we are changing this graph's inputs, we clone the module and clear the GraphExecutor.

Since we work at the ScriptModule level, we cannot take advantage of JIT-level syntactic sugar like run_pass(), so I exposed this via a cpp extension. Let me know if there are other ideas about this.

Test Plan:
buck test caffe2/torch/fb/model_transform:signature_translation_test
Todo: Verify use in bento

Untranslated graph:
```
> graph(%self : __torch__.test_jit.SparseNNWrapper,
>       %inputs.1 : NamedTuple(dense : Tensor, sparse : Dict(int, Tensor))):
>   %2 : __torch__.test_jit.SparseNN = prim::GetAttr[name="main_module"](%self)
>   %4 : Tensor = prim::CallMethod[name="forward"](%2, %inputs.1) # /data/users/ansha/fbsource/fbcode/buck-out/dev/gen/caffe2/test/jit#binary,link-tree/test_jit.py:12141:23
>   return (%4)
```

Translated graph:
```
> graph(%self : __torch__.test_jit.___torch_mangle_1.SparseNNWrapper,
>       %inputs.1_0 : Tensor,
>       %inputs.1_1 : Dict(int, Tensor)):
>   %2 : __torch__.test_jit.___torch_mangle_2.SparseNN = prim::GetAttr[name="main_module"](%self)
>   %3 : Tensor = prim::CallMethod[name="forward"](%2, %inputs.1_0, %inputs.1_1) # /data/users/ansha/fbsource/fbcode/buck-out/dev/gen/caffe2/test/jit#binary,link-tree/test_jit.py:12141:23
>   return (%3)
```
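
For reference, a hedged Python sketch of the module shape the graphs above correspond to (class and attribute names follow the test graphs, the NamedTuple name is hypothetical, and the pass itself is an fb-internal cpp extension that is not shown):

```python
from typing import Dict, NamedTuple
import torch
import torch.nn as nn

class SparseNNInputs(NamedTuple):
    dense: torch.Tensor
    sparse: Dict[int, torch.Tensor]

class SparseNN(nn.Module):
    def forward(self, inputs: SparseNNInputs):
        return inputs.dense

class SparseNNWrapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.main_module = SparseNN()

    def forward(self, inputs: SparseNNInputs):
        return self.main_module(inputs)

mod = torch.jit.script(SparseNNWrapper())
# The pass rewrites forward(inputs: SparseNNInputs) into
# forward(inputs_0: Tensor, inputs_1: Dict[int, Tensor]), matching the
# "Translated graph" above.
```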

Reviewed By: houseroad

Differential Revision: D20313673

fbshipit-source-id: fddd07c9537dc8b6f480a14d697bea10ecc74470
2020-04-09 22:05:43 -07:00
Jeremy Lilley
8d64a3848c [jit] In RPC Server, handle TorchScript continuations asynchronously (#34109)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34109

This change adds glue to GraphExecutor to give the RPC server
access to the future-based Interpreter::runAsync() api.

Previously, if a server encountered a TorchScript continuation-based block
with fork/wait, it would simply block in the server thread until the handler
completed, since it uses the synchronous Interpreter::run() api.

With the ivalue::Future returned by the Interpreter, we can run the
TorchScript code asynchronously from c++ simply by connecting its
callback to the server callback.

We add test cases to cover the new logic, both rpc_async and remote.
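
A minimal sketch of the kind of TorchScript continuation this change affects (a fork/wait block a server-side handler might run); the RPC wiring itself is omitted and the function names are illustrative:

```python
import torch

@torch.jit.script
def slow_add(x: torch.Tensor) -> torch.Tensor:
    return x + x

@torch.jit.script
def handler(x: torch.Tensor) -> torch.Tensor:
    # fork returns an ivalue::Future; previously an RPC server thread would
    # block in Interpreter::run() until it completed, whereas now the server
    # can chain onto the future returned by Interpreter::runAsync() instead.
    fut = torch.jit.fork(slow_add, x)
    y = x * 2
    return y + torch.jit.wait(fut)

print(handler(torch.ones(3)))
```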

ghstack-source-id: 101245438

Test Plan: buck test mode/dev-nosan caffe2/test/distributed/rpc/...

Differential Revision: D20194321

fbshipit-source-id: 16785ec5d9ed0b16cb1ffab0a9771a77de30fcb0
2020-03-31 17:21:46 -07:00
Ilia Cherniavskii
800d5617c0 Recording of TorchScript functions (#34710)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34710

Extends the RecordFunction API to support new recording scopes (such as TorchScript functions), and gives more flexibility in setting the sampling rate.

Test Plan: unit test (test_misc.cpp/testRecordFunction)

Reviewed By: gdankel, dzhulgakov

Differential Revision: D20158523

fbshipit-source-id: a9e0819d21cc06f4952d92d43246587c36137582
2020-03-31 00:33:23 -07:00
Hong Xu
027d7f7ba5 Delete AT_WARN and replace all AT_WARN with TORCH_WARN (#34623)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34623

The bandaid of "AT_WARN" keeps introducing new warnings. Let's get rid
of it entirely.

Close #34502

Test Plan: Imported from OSS

Differential Revision: D20420112

Pulled By: albanD

fbshipit-source-id: 7160c113cb4deb2d2f50a375356f423fe5e86f50
2020-03-13 12:27:22 -07:00
James Reed
45a504dd2d [JIT] Introduce BuiltinOpFunction and integrate into torchbind (#34098)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34098

* #33900 [JIT] Move stuff out of class_type.cpp

Test Plan: Imported from OSS

Differential Revision: D20229166

Pulled By: jamesr66a

fbshipit-source-id: d658a63a5d6e372e675f35b8456adc8de82b49f3
2020-03-07 10:03:56 -08:00
James Reed
60e8615a6d [JIT] Virtualize Function (#33921)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33921

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.intern.facebook.com/intern/diff/D20153092/)!

Test Plan: Imported from OSS

Differential Revision: D20177227

Pulled By: jamesr66a

fbshipit-source-id: 87f3e484c4f873d60f76f50f6789c1b4a73bdfde
2020-03-07 10:03:50 -08:00