Summary:
This PR introduces frame ids that will allow us to associate profiling information with its corresponding run.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33788
Differential Revision: D20164897
Pulled By: Krovatkin
fbshipit-source-id: 8172ff9f4d188b339e2ff98a80bbe4a2b306a8aa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35849
This change harmonizes some aspects of the api.
- torch::utils::Future callback should have no args, like ivalue::Future.
Many of the lines of this change are related to fixing that up downstream.
No args makes the API simpler to use, particularly since many/most of the
downstream use cases ignore the passed-in args. It's simple enough to
appropriately capture the future in the lambda if necessary (see the sketch after this list).
- Add error/hasError methods to ivalue::Future.
- Use c10::optional underneath for the error in ivalue::Future.
- Change markCompleted(error) to setError(error) in ivalue::Future.
- Add a setValue(FutureError) variant to torch::utils::Future.
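For illustration, here is a minimal sketch of the no-arg callback style, using a hypothetical MiniFuture class rather than the real torch::utils::Future / ivalue::Future API (the method names only mirror the bullets above):
```
// Hypothetical stand-in for the futures discussed above; not the real API.
#include <functional>
#include <iostream>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

template <typename T>
class MiniFuture {
 public:
  // Callbacks take no arguments; capture the future if the value is needed.
  void addCallback(std::function<void()> cb) {
    if (completed_) { cb(); return; }
    callbacks_.push_back(std::move(cb));
  }
  void setValue(T value) {
    value_ = std::move(value);
    fireCallbacks();
  }
  void setError(std::string err) {
    error_ = std::move(err);  // error held in an optional underneath
    fireCallbacks();
  }
  bool hasError() const { return error_.has_value(); }
  const T& value() const { return value_; }
  const std::string& error() const { return *error_; }

 private:
  void fireCallbacks() {
    completed_ = true;
    for (auto& cb : callbacks_) cb();
    callbacks_.clear();
  }
  bool completed_ = false;
  T value_{};
  std::optional<std::string> error_;
  std::vector<std::function<void()>> callbacks_;
};

int main() {
  auto fut = std::make_shared<MiniFuture<int>>();
  // The lambda takes no args and captures the future to read its state.
  fut->addCallback([fut]() {
    if (fut->hasError()) {
      std::cout << "error: " << fut->error() << "\n";
    } else {
      std::cout << "value: " << fut->value() << "\n";
    }
  });
  fut->setValue(42);
  return 0;
}
```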
ghstack-source-id: 101684435
Test Plan: buck test mode/dev-nosan caffe2/test/...
Differential Revision: D20803251
fbshipit-source-id: e3d925287bd9a80d649843eef5f270163f448269
Summary:
**Summary:** This PR contains the infrastructure of a new CUDA fuser. This CUDA fuser is based on many of the same principles as TensorExpressions and Halide, but the implementation is built from the ground up. The fusion pass itself is similar to the default CUDA fuser; however, it has undergone some refactoring and uses the new code generation infrastructure. For those interested in how the code generation in this PR works, I would recommend reviewing _test/cpp/jit/test_gpu_fusion.cpp_ as well as the long comment section at the beginning of _torch/csrc/jit/codegen/cuda/transform_replay.h_.

One of the largest differences between our approach and that of TVM/Halide is the concept of "TensorView". At a high level, a TensorView should be thought of similarly to how we think of working with Tensors in PyTorch: it is an N-D object which can undergo transformations that change its dimensionality. Dimensionality changes are done through the operations split/merge/reorder/computeAt. These transformations are similar to split/fuse/reorder/compute_at in TVM; they modify how a tensor is iterated over to generate GPU code. Interestingly, in our scheme these transformations are applied to individual tensors and only impact how that tensor is generated.
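As a purely conceptual C++ sketch (not the fuser's API), here is what a split by a factor of 4 on the single axis of a 1-D TensorView-like iteration domain means for the generated loop nest; only the iteration changes, not the storage:
```
#include <cstdio>

int main() {
  constexpr int N = 16;
  float T[N];
  // Before the split: for (int i = 0; i < N; ++i) T[i] = 2.0f * i;
  // After split(axis 0, factor 4): the axis becomes an (outer, inner) pair.
  for (int i_outer = 0; i_outer < N / 4; ++i_outer) {
    for (int i_inner = 0; i_inner < 4; ++i_inner) {
      int i = i_outer * 4 + i_inner;  // recover the original index
      T[i] = 2.0f * static_cast<float>(i);
    }
  }
  std::printf("T[5] = %.1f\n", T[5]);
  return 0;
}
```
On a GPU, axes like i_outer and i_inner are the kind that would typically end up bound to block and thread indices.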
**Warning:** This PR is purposefully not feature complete with the current fuser. We wanted to separate out the infrastructure from the fusion capabilities. Once in, smaller incremental PRs will be submitted to expand capabilities of the fuser.
**Short term goals:**
Parity with current CUDA fuser (including performance):
- Dynamic shapes (no recompilation)
- Implicit handling of broadcast (broadcasted tensors are treated as tensors of the broadcasted size in the generated code)
- Dropout
**Mid-term goals:**
- Transposes fused with pointwise operations where the transpose involves only 2 axes (across the fused operation).
- 1-D reductions fused with pointwise operations
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34785
Reviewed By: ZolotukhinM
Differential Revision: D20650977
Pulled By: soumith
fbshipit-source-id: ee39c95a880e1b9822e874ed4cc180971572bf63
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35523
In this PR we extend ThreadLocalState to cover dispatch keys and
ThreadLocalDebugInfo, and move it from the JIT interpreter down to the
thread management (at::launch) and autograd (backward threads) code.
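A self-contained sketch of the underlying pattern, with hypothetical names (this is not PyTorch's ThreadLocalState or at::launch): the submitting thread snapshots its thread-local state, and a wrapper restores that snapshot inside the task on the worker thread:
```
#include <functional>
#include <iostream>
#include <string>
#include <thread>
#include <utility>

thread_local std::string tls_debug_info;     // stand-in for ThreadLocalDebugInfo
thread_local unsigned tls_dispatch_keys = 0; // stand-in for the dispatch key set

struct StateSnapshot {
  std::string debug_info = tls_debug_info;   // captured on the constructing thread
  unsigned dispatch_keys = tls_dispatch_keys;
};

// Wrap a task so it runs under the submitter's thread-local state.
std::function<void()> wrapWithState(std::function<void()> fn) {
  StateSnapshot state;  // snapshot taken on the submitting thread
  return [state, fn = std::move(fn)]() {
    tls_debug_info = state.debug_info;       // restored on the worker thread
    tls_dispatch_keys = state.dispatch_keys;
    fn();
  };
}

int main() {
  tls_debug_info = "profiling-run-7";
  tls_dispatch_keys = 0b101;
  std::thread worker(wrapWithState([]() {
    std::cout << "worker sees: " << tls_debug_info
              << " keys=" << tls_dispatch_keys << "\n";
  }));
  worker.join();
  return 0;
}
```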
Test Plan: unit tests (CI)
Reviewed By: dzhulgakov
Differential Revision: D20615714
fbshipit-source-id: 16a9fc96a25cb6c2629230b1187fbf78786ac565
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34710
Extends the RecordFunction API to support new recording scopes (such as TorchScript functions), and gives more flexibility in setting the sampling rate.
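For a sense of what scoped, sampled callbacks look like, here is a hedged sketch with illustrative names only (not the real RecordFunction API): callbacks are registered per scope and fired with a configurable sampling probability:
```
#include <cstdlib>
#include <functional>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

enum class RecordScope { FUNCTION, TORCHSCRIPT_FUNCTION, USER_SCOPE };

struct Callback {
  RecordScope scope;
  double sampling_prob;                        // 1.0 = always fire
  std::function<void(const std::string&)> fn;  // receives the function name
};

std::vector<Callback> g_callbacks;

void addCallback(RecordScope scope, double sampling_prob,
                 std::function<void(const std::string&)> fn) {
  g_callbacks.push_back({scope, sampling_prob, std::move(fn)});
}

void recordFunction(RecordScope scope, const std::string& name) {
  for (const auto& cb : g_callbacks) {
    if (cb.scope != scope) continue;
    double coin = static_cast<double>(std::rand()) / RAND_MAX;
    if (coin <= cb.sampling_prob) cb.fn(name);  // sampled invocation
  }
}

int main() {
  addCallback(RecordScope::TORCHSCRIPT_FUNCTION, /*sampling_prob=*/0.5,
              [](const std::string& name) {
                std::cout << "observed TorchScript fn: " << name << "\n";
              });
  for (int i = 0; i < 4; ++i) {
    recordFunction(RecordScope::TORCHSCRIPT_FUNCTION, "my_script_fn");
  }
  return 0;
}
```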
Test Plan: unit test (test_misc.cpp/testRecordFunction)
Reviewed By: gdankel, dzhulgakov
Differential Revision: D20158523
fbshipit-source-id: a9e0819d21cc06f4952d92d43246587c36137582
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35115
This commit runs the newly added tools/clang_format.py on the JIT
codebase and includes all of the formatting changes thus produced.
Testing:
Ran the script, CI.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D20568523
Pulled By: SplitInfinity
fbshipit-source-id: e09bdb982ccf090eecfb7c7b461b8d0681eef82b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34360
The distributed autograd context sets up a thread-local context id
which is used to perform appropriate bookkeeping and autograd recording of RPC
functions in the forward pass.
However, if we use torch.jit._fork within the distributed autograd context, the
code executed within torch.jit._fork will lose this context since it is run in
a separate JIT thread and the thread local is not set in that thread.
To fix this problem, we pass the distributed autograd context in to
torch.jit._fork, similarly to what we did in
https://github.com/pytorch/pytorch/pull/16101.
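Sketched in C++ with illustrative names (not the actual distributed autograd or torch.jit._fork code), the fix amounts to capturing the caller's thread-local context id and re-installing it in the thread that runs the forked work, so the context can still be looked up there:
```
#include <iostream>
#include <map>
#include <string>
#include <thread>

thread_local int tls_dist_autograd_context_id = -1;  // -1: no active context

// Hypothetical registry of active contexts, keyed by context id.
std::map<int, std::string> g_contexts = {{7, "ctx-7 bookkeeping"}};

template <typename Fn>
std::thread forkWithContext(Fn fn) {
  int captured_id = tls_dist_autograd_context_id;  // captured on the caller
  return std::thread([captured_id, fn]() {
    tls_dist_autograd_context_id = captured_id;    // restored in the forked thread
    fn();
  });
}

int main() {
  tls_dist_autograd_context_id = 7;  // pretend we are inside a dist autograd context
  auto t = forkWithContext([]() {
    auto it = g_contexts.find(tls_dist_autograd_context_id);
    std::cout << (it != g_contexts.end() ? it->second : "no context") << "\n";
  });
  t.join();
  return 0;
}
```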
ghstack-source-id: 100445465
Test Plan: waitforbuildbot
Differential Revision: D20301352
fbshipit-source-id: aa3fffe69c2b40722c66213351a4e0d77484a621
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34623
The bandaid of "AT_WARN" keeps introducing new warnings. Let's get rid
of it entirely.
Closes #34502
Test Plan: Imported from OSS
Differential Revision: D20420112
Pulled By: albanD
fbshipit-source-id: 7160c113cb4deb2d2f50a375356f423fe5e86f50
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33921
**NOTE FOR REVIEWERS**: This PR has internal Facebook-specific changes or comments; please review them on [Phabricator](https://our.intern.facebook.com/intern/diff/D20153092/)!
Test Plan: Imported from OSS
Differential Revision: D20177227
Pulled By: jamesr66a
fbshipit-source-id: 87f3e484c4f873d60f76f50f6789c1b4a73bdfde
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33834
This changes how we report Tracebacks to make them clearer when
there are both serialized and non-serialized ranges. It now looks like:
```
Traceback (most recent call last):
File "foo.py", line 25, in <module>
s2(a, b)
File "/scratch/zdevito/pytorch/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/__torch__.py", line 7, in forward
x: Tensor,
y: Tensor) -> Tensor:
return (self).bar(x, y, )
~~~~~~~~~ <--- HERE
def bar(self: __torch__.Moo,
x: Tensor,
File "code/__torch__.py", line 11, in bar
x: Tensor,
y: Tensor) -> Tensor:
_0 = (self).baz(x, y, )
~~~~~~~~~ <--- HERE
_1 = torch.ones([3], dtype=None, layout=None, device=None, pin_memory=None)
return torch.add(_0, _1, alpha=1)
File "code/__torch__.py", line 17, in baz
x: Tensor,
y: Tensor) -> Tensor:
return torch.add(x, y, alpha=1)
~~~~~~~~~ <--- HERE
Traceback of TorchScript, original code (most recent call last):
File "foo.py", line 11, in forward
def forward(self, x, y):
return self.bar(x, y)
~~~~~~~~ <--- HERE
File "foo.py", line 9, in bar
def bar(self, x, y):
return self.baz(x, y) + torch.ones(3)
~~~~~~~~ <--- HERE
File "foo.py", line 7, in baz
def baz(self, x, y):
return x + y
~~~~~ <--- HERE
RuntimeError: The size of tensor a (4) must match the size of tensor b (5) at non-singleton dimension 1
```
It follows the Python convention of putting the most important information last
and reading from the bottom up.
Changes:
* Moved the error message to the end, to match Python
* Report the original traceback separately from the serialized traceback
* Make sure root functions have names in the interpreter trace.
Test Plan: Imported from OSS
Differential Revision: D20126136
Pulled By: zdevito
fbshipit-source-id: fd01f9985e5d74e04c4d064c02e8bc320f4fac13