pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Louis Feng	eb75cfb9c0	Back out "Revert D23323486: DPP Async Tracing" plus windows build fix. (#44702 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44702 Original commit changeset: c6bd6d277aca This diff caused windows build to fail due to a compiler bug in VS2019 (lambda capture constant int value). This back out works around the issue with explicit capture of const int value. Test Plan: Tested and previously landed. Reviewed By: mruberry Differential Revision: D23703215 fbshipit-source-id: f9ef23be97540bc9cf78a855295fb8c69f360459	2020-09-16 11:32:11 -07:00
Mike Ruberry	7036e91abd	Revert D23323486: DPP Async Tracing Test Plan: revert-hammer Differential Revision: D23323486 (`71673b31f9`) Original commit changeset: 4b6ca6c0e320 fbshipit-source-id: c6bd6d277aca070bef2de3522c2a60e23b4395ad	2020-09-15 01:19:23 -07:00
Louis Feng	71673b31f9	DPP Async Tracing (#44252 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44252 Add tracing to DPP client. Because DPP requests are async, we need to be able to start a trace event in one thread and potentially end it in a different thread. RecordFunction and LibgpumonObserver previously assumed that each trace event starts and finishes in the same thread, so they used a thread-local context to track enter and exit callbacks. Async events break this assumption. This change attaches the event context to the RecordFunction object so we do not need to use a thread-local context. Test Plan: Tested with dpp perf test and was able to collect a trace. {F307824044} Reviewed By: ilia-cher Differential Revision: D23323486 fbshipit-source-id: 4b6ca6c0e32028fb38a476cd1f44c17a001fc03b	2020-09-14 18:43:14 -07:00
Nikolay Korovaiko	f91bdbeabd	Enable function calls in TEFuser and SpecializeAutogradZero (#43866 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/43866 Reviewed By: ezyang Differential Revision: D23452798 Pulled By: Krovatkin fbshipit-source-id: 2cff4c905bf1b5d9de56e7869458ffa6fce1f1b5	2020-09-03 14:42:52 -07:00
Pritam Damania	f1624b82b5	Preserve python backtrace in autograd engine errors. (#43684 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43684 This PR attempts to address #42560 by capturing the appropriate exception_ptr in the autograd engine and passing it over to the Future. As part of this change, there is a significant change to the Future API: setError now only accepts an exception_ptr. For the example in #42560, the exception trace would now look like: ``` > Traceback (most recent call last): > File "test_autograd.py", line 6914, in test_preserve_backtrace > Foo.apply(t).sum().backward() > File "torch/tensor.py", line 214, in backward > torch.autograd.backward(self, gradient, retain_graph, create_graph) > File "torch/autograd/__init__.py", line 127, in backward > allow_unreachable=True) # allow_unreachable flag > File "torch/autograd/function.py", line 87, in apply > return self._forward_cls.backward(self, *args) > File "test_autograd.py", line 6910, in backward > raise ValueError("something") > ValueError: something ``` ghstack-source-id: 111109637 Test Plan: waitforbuildbot Reviewed By: albanD Differential Revision: D23365408 fbshipit-source-id: 1470c4776ec8053ea92a6ee1663460a3bae6edc5	2020-09-01 01:28:47 -07:00
Nikolay Korovaiko	000739c31a	Function calls for fallback paths (#43274 ) Summary: This PR adds API to package unoptimized/fallback blocks as function calls. It's mainly meant to be used by TensorExpressionsFuser and SpecializeAutogradZero passes as both specialize the original graph but would also like to provide a fallback path in case the assumptions under which the graph was specialized do not hold for some inputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43274 Reviewed By: malfet Differential Revision: D23406961 Pulled By: Krovatkin fbshipit-source-id: ef21fc9ad886953461b09418d02c75c58375490c	2020-08-28 23:31:02 -07:00
Nikolay Korovaiko	a97ca93c0e	remove prim::profile and special-casing (#43160 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/43160 Reviewed By: ZolotukhinM Differential Revision: D23284421 Pulled By: Krovatkin fbshipit-source-id: 35e97aad299509a682ae7e95d7cef53301625309	2020-08-22 23:52:36 -07:00
Elias Ellison	91f3114fc1	[JIT] Represent profiled types as a node attribute (#43035 ) Summary: This changes profiled types from being represented as: `%23 : Float(4:256, 256:1, requires_grad=0, device=cpu) = prim::profile(%0)` -> `%23 : Tensor = prim::profile[profiled_type=Float(4:256, 256:1, requires_grad=0, device=cpu)](%0)` Previously, by representing the profiled type in the IR directly it was very easy for optimizations to accidentally use profiled types without inserting the proper guards that would ensure that the specialized type would be seen. It would be a nice follow up to extend this to prim::Guard as well, however we have short term plans to get rid of prim::Guard. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43035 Reviewed By: ZolotukhinM Differential Revision: D23120226 Pulled By: eellison fbshipit-source-id: c78d7904edf314dd65d1a343f2c3a947cb721b32	2020-08-14 20:17:46 -07:00
Ilia Cherniavskii	a53fdaa23f	Remove ProfiledType (#42570 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42570 ProfiledType doesn't do anything and is not used atm, removing Test Plan: CI Reviewed By: ezyang Differential Revision: D22938664 Pulled By: ilia-cher fbshipit-source-id: 037c512938028f44258b702bbcde3f8c144f4aa0	2020-08-06 01:52:08 -07:00
Ilia Cherniavskii	e7a09b4d17	RecordFunction in Dispatcher (#37587 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37587 Lifting RecordFunction up into the dispatcher code Test Plan: Imported from OSS Differential Revision: D21374246 fbshipit-source-id: 19f9c1719e6fd3990e451c5bbd771121e91128f7	2020-07-17 22:20:05 -07:00
Rohan Varma	bf9cc5c776	Add callback with TLS state API in futures (#40326 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40326 Adds a helper function `addCallbackWithTLSState` to both torch/csrc/utils/future.h, which is used internally by the RPC framework, and the JIT future. Uses this helper function to avoid passing in TLS state manually where it is needed for RPC and in `record_function_ops.cpp`. For example, the following: ``` at::ThreadLocalState tls_state; fut->addCallback([tls_state = std::move(tls_state)]() { at::ThreadLocalStateGuard g(tls_state); some_cb_that_requires_tls_state(); }); ``` becomes ``` fut->addCallbackWithTLSState(some_cb_that_requires_tls_state); ``` ghstack-source-id: 107383961 Test Plan: RPC Tests and added a test in test_misc.cpp Differential Revision: D22147634 fbshipit-source-id: 46c02337b90ee58ca5a0861e932413c40d06ed4c	2020-07-08 23:25:35 -07:00
Sebastian Messmer	53af9df557	Unify boxed function signature between jit and c10 (#37034 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37034 c10 takes a Stack* in boxed functions while JIT took Stack&. c10 doesn't return anything while JIT returns an int which is always zero. This changes JIT to follow the c10 behavior. ghstack-source-id: 106834069 Test Plan: unit tests Differential Revision: D20567950 fbshipit-source-id: 1a7aea291023afc52ae706957e9a5ca576fbb53b	2020-06-29 19:24:26 -07:00
Jeremy Lilley	569c85b45d	[futures] Add assert to Future constValue() accessor, add hasValue(). (#39950 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39950 Per the comment in the code, constValue() should only be used in the case where the future was complete and value was not an error. Add an assert to enforce this. Also, add hasValue() accessor for completeness. ghstack-source-id: 105815597 Test Plan: buck test mode/dev-nosan caffe2/test/cpp/jit: Differential Revision: D22021776 fbshipit-source-id: b59b6c775eab344068a76f4cd8c3a9dc1f2a174e	2020-06-15 12:11:22 -07:00
Nikolay Korovaiko	7f55197a57	Peel Loop (#39434 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39434 Differential Revision: D21857037 Pulled By: Krovatkin fbshipit-source-id: 6583da167fe93d96e93f1c3d71f46f94e7f4e982	2020-06-10 13:48:18 -07:00
Jeremy Lilley	be3bbfc917	[futures] Add collectAny() to ivalue::Future for completeness (#39597 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39597 To complement collectAll(), this change adds collectAny() and adds relevant unit test coverage. We also remove the vector-based helper version of collectAll() added in a previous change, which was of debatable usefulness. ghstack-source-id: 105527180 Test Plan: buck test mode/dev-nosan caffe2/test/cpp/jit/... Differential Revision: D21910311 fbshipit-source-id: dbb3ca404672a3d751b1b3cf016e6084a9ff8040	2020-06-09 16:32:52 -07:00
Jeremy Lilley	b83fed8d4c	[futures] Add c++ ivalue::Future collectAll() helper (#39119 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39119 Add some base c++ unittest coverage for ivalue::Future, and in the process, add a basic collectAll() primitive, per 38937. In the process, I realized that List<Future> is effectively impossible to construct (since the Future's type is not templated, but rather passed in, the getTypePtr_<T>::call() isn't defined), so added a workaround in List to make it possible. ghstack-source-id: 105309650 Test Plan: buck test mode/dev-nosan caffe2/test/cpp/jit/... Differential Revision: D21756884 fbshipit-source-id: 5d40c8d1c55098de5497655c7b887f4f56508a37	2020-06-08 05:52:09 -07:00
Linbin Yu	b28422d444	add overload name for str cmp (#39607 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39607 Add an overload name for the strcmp macro to prevent duplicated op names in the lite interpreter. Also reformatted some other files. Test Plan: verified that these op schemas are changed ``` -aten::eq(str a, str b) -> (bool) +aten::eq.str(str a, str b) -> (bool) -aten::ne(str a, str b) -> (bool) +aten::ne.str(str a, str b) -> (bool) -aten::lt(str a, str b) -> (bool) +aten::lt.str(str a, str b) -> (bool) -aten::gt(str a, str b) -> (bool) +aten::gt.str(str a, str b) -> (bool) -aten::le(str a, str b) -> (bool) +aten::le.str(str a, str b) -> (bool) -aten::ge(str a, str b) -> (bool) +aten::ge.str(str a, str b) -> (bool) ``` Reviewed By: iseeyuan Differential Revision: D21913049 fbshipit-source-id: 518db068c8c5b0efd19223f0bd94fc3351335dc4	2020-06-06 23:21:35 -07:00
Nikolay Korovaiko	97a2918a07	reduce number of bailout nodes (#38281 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38281 Differential Revision: D21665509 Pulled By: Krovatkin fbshipit-source-id: c2c34b759aec30d0a161e582030ba994192ee4ec	2020-06-05 13:45:37 -07:00
Ilia Cherniavskii	abe2be2063	[resubmit] Use TensorMethods.cpp (#39385 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39385 see https://github.com/pytorch/pytorch/pull/37639 Test Plan: https://github.com/pytorch/pytorch/pull/37639 Imported from OSS Differential Revision: D21833287 fbshipit-source-id: 9928d3f4122903d0de67ad312e349352d5f5c19c	2020-06-02 20:27:51 -07:00
Edward Yang	2fe0fc2684	Revert D21374247: Use TensorMethods.cpp Test Plan: revert-hammer Differential Revision: D21374247 Original commit changeset: 076964415079 fbshipit-source-id: 732ec8c561d1f37475c1b5549ba79c718e3a6db8	2020-06-01 08:12:09 -07:00
Ilia Cherniavskii	68e62b9ab6	Use TensorMethods.cpp (#37639 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37639 Changing TensorMethods.h to .cpp Necessary to avoid incomplete types in dispatcher Test Plan: CI Imported from OSS checked mobile size, no change, small reduction in size in fbios fbios: Succeeded Change in Download Size for arm64 + 3x assets variation: -18.2 KiB Change in Uncompressed Size for arm64 + 3x assets variation: -8.8 KiB reran benchmark, no stat. significant difference buck run mode/opt caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads:benchmark_torchscript_model -- --model_file caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads/addmodule.pt --num_runs 3 ╷ @ 68592d0d 41 minutes ago iliacher D21374247 ╭─╯ Use TensorMethods.cpp Created 3 benchmark runs on aibench for caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads/addmodule.pt. Links to the results: * Adhoc run: https://our.intern.facebook.com/intern/aibench/details/1729113760 * Adhoc run: https://our.intern.facebook.com/intern/aibench/details/3867976782 * Adhoc run: https://our.intern.facebook.com/intern/aibench/details/2782186766 hg prev @ 7f501b42 Thursday at 14:26 bvaughan D21764704 ╷ short-circuit pow for complex 1 and 0 exponents Created 3 benchmark runs on aibench for caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads/addmodule.pt. Links to the results: * Adhoc run: https://our.intern.facebook.com/intern/aibench/details/2155256332 * Adhoc run: https://our.intern.facebook.com/intern/aibench/details/1802057074 * Adhoc run: https://our.intern.facebook.com/intern/aibench/details/4119590830 Differential Revision: D21374247 fbshipit-source-id: 076964415079cf84fb57f1f7b43d087afed86e1d	2020-05-31 17:11:12 -07:00
Ilia Cherniavskii	a5e023f28a	Set RecordFunction id only when needed (#39265 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39265 In this PR we set id of RecordFunction only when callbacks need them and when there's at least one active callback Test Plan: testRecordFunction unit test in test_misc.cpp buck test mode/dev caffe2/test/cpp/jit:jit https://our.intern.facebook.com/intern/testinfra/testrun/8725724291116413 Reviewed By: dzhulgakov Differential Revision: D21790421 fbshipit-source-id: 016623d7f1a2a271921a71c0483061e232b40321	2020-05-29 15:34:44 -07:00
Luca Antiga	e088902b4a	Add type-hint check for default arguments in TorchScript C++ frontend (#39021 ) Summary: This PR fixes https://github.com/pytorch/pytorch/issues/39020 by requiring users to type-hint default arguments to a TorchScript when using the C++ frontend (the Python frontend will insert those automatically). Since this is a bit of a niche use case, I opted for the simpler solution of making type-hints mandatory for default arguments, as opposed to trying to type-infer them. I left a comment in the code justifying this choice. Test is included. /cc t-vi Pull Request resolved: https://github.com/pytorch/pytorch/pull/39021 Differential Revision: D21755317 Pulled By: suo fbshipit-source-id: e007650d3bfb3a4c58c25ad2c3a17759898f303b	2020-05-28 01:42:04 -07:00
Nikita Shulga	c6e9e9359f	[Codemod][GleanFbcode] Remove dead includes in caffe2/test (#39023 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39023 Reviewed By: orionr Differential Revision: D21702529 fbshipit-source-id: 6945bba95609102409850b105a8a091e33b8acc9	2020-05-27 14:07:26 -07:00
Ilia Cherniavskii	43dd8760d7	Move ThreadLocalDebugInfo to c10 (#37774 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37774 Move ThreadLocalDebugInfo from ATen to C10 Test Plan: Imported from OSS Differential Revision: D21384249 Pulled By: ilia-cher fbshipit-source-id: f9b5089a868f84a2ee013695a481fcc883d3c6b2	2020-05-11 19:27:41 -07:00
Ilia Cherniavskii	facc5e0cc4	Make profiler thread local (#36291 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36291 Move profiler state to be a thread local property, reuse existing thread local propagation mechanism to ensure correct profiling of async tasks. This also makes push/pop callback thread safe and easier to use in e.g. the distributed profiler Test Plan: USE_BLAS=MKL USE_MKLDNN=0 USE_CUDA=0 python setup.py develop install ./build/bin/test_jit ./build/bin/test_jit python test/test_autograd.py python test/test_jit.py Differential Revision: D20938501 Pulled By: ilia-cher fbshipit-source-id: c0c6c3eddcfea8fc7c14229534b7246a0ad25845	2020-05-07 14:52:49 -07:00
Ilia Cherniavskii	2ef4010593	Propagate TLS callbacks with ThreadLocalState (#37745 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37745 This PR makes it possible to set TLS callbacks and use them transparently not only in the main thread but also in any async tasks Test Plan: Imported from OSS Differential Revision: D21374873 Pulled By: ilia-cher fbshipit-source-id: 3be2e121673b32d7694e17e794f3b474826dffe9	2020-05-07 14:52:44 -07:00
Ilia Cherniavskii	2d708cefcc	Move RecordFunction into ATen (#37548 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37548 Moving RecordFunction from torch::autograd::profiler into at namespace Test Plan: CI Imported from OSS Differential Revision: D21315852 fbshipit-source-id: 4a4dbabf116c162f9aef0da8606590ec3f3847aa	2020-05-07 14:52:39 -07:00
Ilia Cherniavskii	c24c5f9684	Make RecordFunction callbacks thread local and modernize interface (#37491 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37491 This PR modernizes RecordFunction API and adds thread local callbacks in addition to the global ones Changes: - support for TLS callbacks, this is going to be the foundation of profiler and other tools - modernize interface around a simple set of functions (add|remove|has|clear)(Global|ThreadLocal)(Callback) and add RecordFunctionCallback to easily construct callbacks to be passed - we also add `.setShouldRun` into the callback interface to support cases when simple uniform sampling is not enough - to properly support add/remove, introduce the idea of a callback handle returned by add - internal implementation still uses SmallVector to store intermediate state (as before) - in this case these are vectors of handles of callbacks that were picked to run - to speed up runtime we keep these vectors sorted, this way we can quickly enumerate callbacks that need to be run - added tests for new functionality Test Plan: BUILD_BINARY=1 USE_BLAS=MKL USE_MKLDNN=0 USE_CUDA=0 python setup.py develop install ./build/bin/test_jit CI record_function_benchmark: https://gist.github.com/ilia-cher/f1e094dae47fe23e55e7672ac4dcda2f Imported from OSS Differential Revision: D21300448 fbshipit-source-id: 6d55c26dbf20b33d35c3f1604dcc07bb063c8c43	2020-05-07 14:51:02 -07:00
Ilia Cherniavskii	d068a456d3	[resubmit] Enable global observers API (#37382 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37382 After adding c10::DispatchKey::Profiler the behavior of RecordFunction observers is also controlled by the dispatch key, this PR moves the logic outside of the profiler into the record function Reviewed By: jamesr66a Differential Revision: D21268320 fbshipit-source-id: 93207e3b55325d20dcc5b1e8f448ab86933321da	2020-04-28 10:49:31 -07:00
James Reed	1592d6842c	[resubmit] Move profiler to a dispatch wrapper (#36766 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36766 Original commit changeset: dcb41d243369 ghstack-source-id: 102614215 Test Plan: waitforsadcastle Differential Revision: D21076029 fbshipit-source-id: c2461c57cfd364bd23ff99bc2cb5572d22e23391	2020-04-21 16:37:11 -07:00
Ilia Cherniavskii	3ae70cb847	Add RecordFunctionGuard (#36215 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36215 Make it possible to disable observers, e.g. to avoid infinite recursion if an observer uses an operator Test Plan: USE_BLAS=MKL USE_MKLDNN=0 USE_CUDA=0 python setup.py develop install ./build/bin/test_jit Differential Revision: D20912676 Pulled By: ilia-cher fbshipit-source-id: 29760cdfe488a02f943f755967b78779d6dbcef3	2020-04-20 19:19:14 -07:00
Karl Ostmo	4894cba572	Revert D19775659: [WIP] Move profiler to a dispatch wrapper Test Plan: revert-hammer Differential Revision: D19775659 Original commit changeset: 5cbe5f736660 fbshipit-source-id: dcb41d2433697c5d521044a9dbc12c79f31e0929	2020-04-16 14:18:51 -07:00
James Reed	a85c835196	[WIP] Move profiler to a dispatch wrapper (#33057 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33057 Test Plan: Imported from OSS Differential Revision: D19775659 Pulled By: jamesr66a fbshipit-source-id: 5cbe5f736660c8543764ef62b16550638d9ceb72	2020-04-16 13:36:37 -07:00
Ilia Cherniavskii	bc6bd0bb1a	Debug Information Guard Summary: This diff fixes the issues with current handling of debug information passed along the execution of the model. (For example, it is possible that multiple calls to the debug guard may override each other) Test Plan: CI test/cpp/jit Reviewed By: dzhulgakov Differential Revision: D20602775 fbshipit-source-id: 4683957954028af81a1a0f1f12b243650230c9bb	2020-04-01 01:55:29 -07:00
Ilia Cherniavskii	800d5617c0	Recording of TorchScript functions (#34710 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34710 Extending RecordFunction API to support new recording scopes (such as TorchScript functions), as well as giving more flexibility to set sampling rate. Test Plan: unit test (test_misc.cpp/testRecordFunction) Reviewed By: gdankel, dzhulgakov Differential Revision: D20158523 fbshipit-source-id: a9e0819d21cc06f4952d92d43246587c36137582	2020-03-31 00:33:23 -07:00
Meghan Lele	6384c2d81b	[JIT] clang-format JIT code (#35115 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35115 This commit runs the newly added tools/clang_format.py on the JIT codebase and includes all of the formatting changes thus produced. Testing: Ran the script, CI. Test Plan: Imported from OSS Reviewed By: eellison Differential Revision: D20568523 Pulled By: SplitInfinity fbshipit-source-id: e09bdb982ccf090eecfb7c7b461b8d0681eef82b	2020-03-26 11:24:51 -07:00
Michael Suo	c235be42dd	[jit] kill script namespace (#34515 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34515 Once upon a time we thought this was necessary. In reality it is not, so removing it. For backcompat, our public interface (defined in `api/`) still has typedefs to the old `script::` names. There was only one collision: `Pass` as a `Stmt` and `Pass` as a graph transform. I renamed one of them. Test Plan: Imported from OSS Differential Revision: D20353503 Pulled By: suo fbshipit-source-id: 48bb911ce75120a8c9e0c6fb65262ef775dfba93	2020-03-11 23:32:48 -07:00
Edward Yang	cf8b728255	Delete OperatorOptions, absorb AliasAnalysisKind into FunctionSchema. (#34588 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34588 I constructed the patch by deleting OperatorOptions and then rerouting all queries for AliasAnalysisKind to FunctionSchema. Some of the behavior is kind of bogus: we really shouldn't be mutating FunctionSchema after the fact, but that won't get fixed until we actually switch to true schema merging. Reland of https://github.com/pytorch/pytorch/pull/34160 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D20387079 Pulled By: ezyang fbshipit-source-id: d189f7a6ad8cd186b88b6fbfa3f189994eea14e8	2020-03-11 20:59:46 -07:00
Edward Yang	6f8a8e4e47	Revert D20282846: Delete OperatorOptions, absorb AliasAnalysisKind into FunctionSchema. Test Plan: revert-hammer Differential Revision: D20282846 Original commit changeset: ba7bca6e8adc fbshipit-source-id: b9e15d2b2c3d1dbc6e971ab3c0bdf380e769dcf1	2020-03-11 07:50:29 -07:00
Edward Yang	9d42177a31	Delete OperatorOptions, absorb AliasAnalysisKind into FunctionSchema. (#34160 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34160 I constructed the patch by deleting OperatorOptions and then rerouting all queries for AliasAnalysisKind to FunctionSchema. Some of the behavior is kind of bogus: we really shouldn't be mutating FunctionSchema after the fact, but that won't get fixed until we actually switch to true schema merging. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D20282846 Pulled By: ezyang fbshipit-source-id: ba7bca6e8adc3365789639b88e54c4e881b1692e	2020-03-11 07:15:18 -07:00
Ilia Cherniavskii	b50825e011	Make RecordFunction more robust for async use cases (#34122 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34122 Earlier work added support for async rpc cases where RecordFunction's end callbacks might be called in a different thread; in addition, some extra care was needed to handle the pointer to the parent function. This PR makes RecordFunction aware of potentially multiple threads in use, removes the unused parent() call, and restricts current() RecordFunction to scope-based record functions (RECORD_FUNCTION macro) Test Plan: unit tests Differential Revision: D20297709 Pulled By: ilia-cher fbshipit-source-id: 46a59e1b2eea0bbd8a59630385e193b38d30f9d1	2020-03-05 22:28:53 -08:00
Zachary DeVito	358450e02b	improved TorchScript traceback (#33834 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33834 This changes how we report Tracebacks to make them clearer when there are both serialized and non-serialized ranges. It now looks like: ``` Traceback (most recent call last): File "foo.py", line 25, in <module> s2(a, b) File "/scratch/zdevito/pytorch/torch/nn/modules/module.py", line 550, in __call__ result = self.forward(*input, **kwargs) RuntimeError: The following operation failed in the TorchScript interpreter. Traceback of TorchScript, serialized code (most recent call last): File "code/__torch__.py", line 7, in forward x: Tensor, y: Tensor) -> Tensor: return (self).bar(x, y, ) ~~~~~~~~~ <--- HERE def bar(self: __torch__.Moo, x: Tensor, File "code/__torch__.py", line 11, in bar x: Tensor, y: Tensor) -> Tensor: _0 = (self).baz(x, y, ) ~~~~~~~~~ <--- HERE _1 = torch.ones([3], dtype=None, layout=None, device=None, pin_memory=None) return torch.add(_0, _1, alpha=1) File "code/__torch__.py", line 17, in baz x: Tensor, y: Tensor) -> Tensor: return torch.add(x, y, alpha=1) ~~~~~~~~~ <--- HERE Traceback of TorchScript, original code (most recent call last): File "foo.py", line 11, in forward def forward(self, x, y): return self.bar(x, y) ~~~~~~~~ <--- HERE File "foo.py", line 9, in bar def bar(self, x, y): return self.baz(x, y) + torch.ones(3) ~~~~~~~~ <--- HERE File "foo.py", line 7, in baz def baz(self, x, y): return x + y ~~~~~ <--- HERE RuntimeError: The size of tensor a (4) must match the size of tensor b (5) at non-singleton dimension 1 ``` It follows the Python convention of having the most important information last and reading from the bottom up. Changes: * Moved the error message to the end, to copy Python * Report original traceback separate from serialized traceback * Make sure root functions have names in the interpreter trace.
Test Plan: Imported from OSS Differential Revision: D20126136 Pulled By: zdevito fbshipit-source-id: fd01f9985e5d74e04c4d064c02e8bc320f4fac13	2020-03-03 12:27:38 -08:00
Michael Suo	dbe850af5b	[jit] do the code reorg (#33851 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33851 Rationale and context described in #33828. Script to reproduce the move: https://gist.github.com/suo/16cbefaaeb67ca5a7c6caffd49b7f6e9 ghstack-source-id: 99079645 Test Plan: Make sure CI passes Reviewed By: jamesr66a Differential Revision: D20133869 fbshipit-source-id: 390e9241a9c85366d9005c492ac31f10aa96488e	2020-02-27 13:02:51 -08:00
Mikhail Zolotukhin	806e7daa1f	Rename TorchScript compiler to IR emitter to better reflect its function. (#33127 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33127 Test Plan: Imported from OSS Differential Revision: D19806503 Pulled By: ZolotukhinM fbshipit-source-id: ab78bdbbac5f12dbcc6c2e2573f5862a16ffcf3d	2020-02-12 18:45:13 -08:00
Zachary DeVito	99349defc1	remove unnecessary Node* ops (#32760 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32760 Minor changes to the way ops are implemented to remove incidental use of Node* in the operator implementation. Current state for operators that previously took Node: ``` TBD: USES NODE: prim::DifferentiableGraph(...) -> (...) USES NODE: prim::profile(...) -> (...) USES NODE: prim::FusionGroup(...) -> (...) USES NODE: prim::PythonOp(...) -> (...) USES NODE: prim::ImplicitTensorToNum(Tensor a) -> Scalar # next PR Should be made interpreter primitives: USES NODE: prim::TupleUnpack(...) -> (...) USES NODE: prim::TupleSlice(...) -> (...) USES NODE: prim::TupleConstruct(...) -> (...) USES NODE: prim::ListUnpack(...) -> (...) USES NODE: prim::ListConstruct(...) -> (...) USES NODE: prim::DictConstruct(...) -> (...) USES NODE: prim::Constant() -> (...) USES NODE: prim::isinstance(...) -> (...) USES NODE: prim::CreateObject(...) -> (...) USES NODE: prim::fork(...) -> (...) USES NODE: aten::warn(str message, , int stacklevel=2) -> () # need stack level information, so ideally in interpreter so it can look at the stack Should be made into vararg operators, i.e. the operators last argument should be an IValue that contains the number of arguments. USES NODE: prim::FusedConcat(...) -> (...) USES NODE: prim::MMTreeReduce(...) -> (...) USES NODE: prim::MMBatchSide(...) -> (...) USES NODE: prim::ConstantChunk(...) -> (...) USES NODE: prim::AutogradAnyNonZero(...) -> bool USES NODE: prim::BroadcastSizes(...) -> (...) USES NODE: prim::ChunkSizes(...) -> (...) USES NODE: aten::format(str self, ...) -> str USES NODE: prim::Print(...) -> (...) fixed: USES NODE: aten::extend(Tensor[](a!) self, Tensor [] other) -> () USES NODE: aten::copy(Tensor[](a) self) -> Tensor[] USES NODE: aten::extend(int[](a!) self, int [] other) -> () USES NODE: aten::copy(int[](a) self) -> int[] USES NODE: aten::extend(float[](a!) 
self, float [] other) -> () USES NODE: aten::copy(float[](a) self) -> float[] USES NODE: aten::extend(bool[](a!) self, bool [] other) -> () USES NODE: aten::copy(bool[](a) self) -> bool[] USES NODE: aten::extend(t[](a!) self, t [] other) -> () USES NODE: aten::copy(t[](a) self) -> t[] USES NODE: aten::keys(Dict(str, t) self) -> str[]() USES NODE: aten::values(Dict(str, t) self) -> t[]() USES NODE: aten::dict((str, tVal)[] inputs) -> Dict(str, tVal) USES NODE: aten::keys(Dict(int, t) self) -> int[]() USES NODE: aten::values(Dict(int, t) self) -> t[]() USES NODE: aten::dict((int, tVal)[] inputs) -> Dict(int, tVal) USES NODE: aten::keys(Dict(float, t) self) -> float[]() USES NODE: aten::values(Dict(float, t) self) -> t[]() USES NODE: aten::dict((float, tVal)[] inputs) -> Dict(float, tVal) USES NODE: aten::keys(Dict(Tensor, t) self) -> Tensor[]() USES NODE: aten::values(Dict(Tensor, t) self) -> t[]() USES NODE: aten::dict((Tensor, tVal)[] inputs) -> Dict(Tensor, tVal) USES NODE: aten::test_vartype2(t a, t[] b) -> (t[]) USES NODE: aten::_ncf_unsqueeze(Tensor self, int ndim) -> Tensor USES NODE: aten::_ncf_view(Tensor self, int[] input_shape, int normalized_ndim) -> Tensor USES NODE: prim::is_none(int? a) -> bool USES NODE: aten::__interpolate(Tensor input, int? size = None, float[]? scale_factor = None, str mode = 'nearest', bool? align_corners = None, bool? recompute_scale_factor = None) -> Tensor USES NODE: aten::__interpolate(Tensor input, int[]? size = None, float[]? scale_factor = None, str mode = 'nearest', bool? align_corners = None, bool? recompute_scale_factor = None) -> Tensor USES NODE: aten::__interpolate(Tensor input, int? size = None, float? scale_factor = None, str mode = 'nearest', bool? align_corners = None, bool? recompute_scale_factor = None) -> Tensor USES NODE: aten::__interpolate(Tensor input, int[]? size = None, float? scale_factor = None, str mode = 'nearest', bool? align_corners = None, bool? 
recompute_scale_factor = None) -> Tensor USES NODE: aten::sorted(t[](a) self) -> (t[]) USES NODE: aten::sort(t[](a!) self, bool reverse=False) -> () USES NODE: aten::test_vartype(t[] a, t b) -> (t) USES NODE: prim::unchecked_unwrap_optional(t(a)? optional) -> t(a) USES NODE: prim::unchecked_cast(...) -> (...) USES NODE: aten::dict() -> Dict(str, Tensor) USES NODE: prim::Load(...) -> (...) USES NODE: prim::Store(...) -> (...) USES NODE: prim::Drop(...) -> (...) USES NODE: aten::tensor(t[] data, , ScalarType? dtype=None, Device? device=None, bool requires_grad=False) -> Tensor USES NODE: aten::as_tensor(t[] data, *, ScalarType? dtype=None, Device? device=None) -> Tensor ``` Test Plan: Imported from OSS Differential Revision: D19615387 Pulled By: zdevito fbshipit-source-id: 95298c3c4249b9f812c332d13f0fb79daeecb662	2020-02-12 14:49:02 -08:00
Elias Ellison	25d33a2ee8	[JIT] Use Type Level Granularity in Alias Analysis Wildcards (#32251 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32251 Previously wildcard sets were associated by TypeKind, meaning all Lists were in one alias set, all Classes were in one alias set, etc. We can improve analysis by bucketing wildcard sets by TypePtr instead. Any two mutable types which can unify should be in the same wildcard set bucket. This also allows us to do a much simpler `mayContainAlias` analysis, and also improves `analyzeConservative` analysis because now we can recurse through all contained memory locations and mark writes, instead of recursing only one level deep in contained elements. Test Plan: Imported from OSS Differential Revision: D19563263 Pulled By: eellison fbshipit-source-id: 371a37d1a8596abc6c53f41c09840b6c140ea362	2020-01-28 18:07:48 -08:00
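A toy sketch of the bucketing change described in 25d33a2ee8, written in Python rather than the C++ used by the alias analysis itself. Types are modelled as plain strings such as `List[int]`, and the helpers `kind_of`, `bucket_by_kind`, and `bucket_by_type` are hypothetical names introduced for illustration. Exact string equality stands in for the real "can these two types unify" check, which is an assumption that simplifies the actual rule.

```python
from collections import defaultdict


def kind_of(type_str):
    """Return the coarse kind of a type string, e.g. 'List[int]' -> 'List'."""
    return type_str.split("[", 1)[0]


def bucket_by_kind(values):
    """Old scheme: one wildcard set per kind, so all Lists share a set."""
    buckets = defaultdict(set)
    for name, type_str in values.items():
        buckets[kind_of(type_str)].add(name)
    return dict(buckets)


def bucket_by_type(values):
    """New scheme (simplified): one wildcard set per full type.

    String equality is a stand-in for type unification here.
    """
    buckets = defaultdict(set)
    for name, type_str in values.items():
        buckets[type_str].add(name)
    return dict(buckets)


# Hypothetical values and their types.
values = {
    "a": "List[int]",
    "b": "List[Tensor]",
    "c": "List[int]",
    "d": "Dict[str, Tensor]",
}

by_kind = bucket_by_kind(values)
by_type = bucket_by_type(values)
print(by_kind)
print(by_type)
```

With kind-level buckets, `a`, `b`, and `c` all land in the same set, so a write to the `List[Tensor]` wildcard must be assumed to affect the `List[int]` values too. With type-level buckets, `b` sits apart from `a` and `c`, which is the extra precision the commit message describes.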
generatedunixname89002005287564	9482683065	Remove dead includes in caffe2/test Reviewed By: ezyang Differential Revision: D19273220 fbshipit-source-id: 3dfc3388914e60611c84472e3fc529f5b5e40534	2020-01-21 11:30:34 -08:00
Zachary DeVito	14593f077f	remove list specialization from ivalue (#30734 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30734 What are specialized lists? The IValues that hold List[int], List[Tensor], and List[AnythingElse] are different C++ types. e.g. List[int] has a std::vector<int> while List[AnythingElse] holds a std::vector<IValue>. Why do we have specialized lists? When we first created the JIT we needed to bind the ATen C++ API which has std::vector<int>, std::vector<Tensor> as inputs. The easiest way to match this API was to make our IValues contain these same types. Conversion was just unwrapping the IValue, very easy and cheap. What is the problem with specialized lists? We end up with significant special cases through the compiler. Other types like Dict are not specialized. So in the Pickler, for instance, there is a single piece of logic to handle their serialization. For Lists, we end up with multiple cases. Furthermore, it doesn't match Python, leading to problems along translation boundaries. Our pickle serialization is slightly different than python, so it is harder to load objects from our IValue serialization as Python values. They also make it harder to provide an easy-to-use user API. We'd like to match pybind11 for C++ bindings to TorchScript. This would entail having a single torch::List class (untemplated) that can be used to construct inputs. This is made much harder if the underlying ivalue needs to be different depending on the type inside the list. The ideal case would be to have a constructor like ``` template<typename T> List(std::vector<T> foo); ``` It would then set up the type tags correctly based on type T, without the need for passing tags. Do specialized lists improve perf? Not in a way we have been able to measure. Our major concern initially was having to translate a std::vector<IValue> to std::vector<int> to call ATen functions. 
This was especially a concern for aten::_convolution which takes a number of mostly-constant lists of integers. However, when we measure the effect of actually having to do this conversion for an aten::_convolution, it does not take measurable time (benchmark results below). This is true even if you use a trivial convolution (e.g. 1x1x1), and comment out the actual convolution code. What are the issues with removing them? This PR removes list specialization but keeps the serialization format and IValue APIs almost exactly the same. The only visible change is that toTensorListRef and family have turned into toTensorVector because they now return by value a copy of the list as a vector. Further PRs can then clean up the complexity issues that arose from specialization. This will likely involve removing the isTensorList/isIntList functions, and refactoring the code that used them to work generically. At some point we will also change serialization to no longer write specialized lists in the pickle binary. This is forward incompatible, so will go in its own PR. 
Benchmark:
```
import torch
import torch.nn as nn
import torch.nn.functional as F
import time

class MnistNet(nn.Module):
    def __init__(self):
        super(MnistNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 1, kernel_size=1)
        self.conv2 = nn.Conv2d(1, 1, kernel_size=1)

    def forward(self, x):
        for i in range(10):
            x = F.relu(self.conv1(x))
            x = F.relu(self.conv2(x))
        return x

model = MnistNet()
x = torch.rand(1, 1, 1, 1)
r = torch.jit.trace(model, x )
r(x)
r(x)
r(x)
r(x)
print(torch.jit.last_executed_optimized_graph())

while True:
    b = time.time()
    for i in range(100):
        r(x)
    e = time.time()
    print(e - b)
```
Results (no observable difference):
```
Before (actual conv)
0.13251137733459473
0.13260436058044434
0.13276338577270508
0.1327497959136963
0.13250041007995605
0.13270330429077148
0.13290190696716309
0.13265132904052734
0.13274288177490234
0.1326758861541748
0.13253355026245117
0.13254785537719727
0.13260746002197266
0.13285017013549805
0.13264012336730957
0.132490873336792
0.13280034065246582
0.13243484497070312
0.1325232982635498
0.1326127052307129
0.13264131546020508
0.13274383544921875
0.13298296928405762
0.1326909065246582
-------------------
After (actual conv)
0.13127517700195312
0.13150334358215332
0.13092470169067383
0.13102364540100098
0.13134360313415527
0.13155555725097656
0.13314104080200195
0.13151955604553223
0.13160037994384766
0.1315293312072754
0.13137340545654297
0.13148093223571777
0.131455659866333
0.1327371597290039
0.13134026527404785
0.13152337074279785
0.13151192665100098
0.13165974617004395
0.13403725624084473
0.13251852989196777
0.13135504722595215
0.1315624713897705
0.1317615509033203
0.1314380168914795
0.13157200813293457
--------------------
The following replace the convolution operator with a no-op, to show that even if the conv op was made faster, then we still would not see a difference:

Before (fake conv)
0.0069539546966552734
0.0069522857666015625
0.007120847702026367
0.007344722747802734
0.007689952850341797
0.007932662963867188
0.00761723518371582
0.007501363754272461
0.007532835006713867
0.007141828536987305
0.007174253463745117
0.007114410400390625
0.007071495056152344
------------------
After (fake conv)
0.007458209991455078
0.007337093353271484
0.007268190383911133
0.007313251495361328
0.007306575775146484
0.007468700408935547
0.0073091983795166016
0.007308483123779297
0.007538318634033203
0.007356882095336914
0.007464170455932617
0.007372140884399414
```
Test Plan: Imported from OSS Differential Revision: D18814702 Pulled By: zdevito fbshipit-source-id: 0371c73b63068fdc12f24b801371ea90f23531a6	2020-01-12 18:28:25 -08:00
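The design argued for in 14593f077f (one list representation, an element-type tag derived from the contents, and typed accessors that return a copy) can be illustrated with a small Python analogy. This is only a sketch under stated assumptions: `GenericList`, `elem_type`, and `to_int_vector` are hypothetical names, not PyTorch APIs, and the real change lives in the C++ IValue code.

```python
class GenericList:
    """Toy stand-in for a single, unspecialized list value (hypothetical)."""

    def __init__(self, items):
        # Every list is stored the same way, whatever its element type.
        self._items = list(items)
        # The tag is inferred from the contents, so callers never pass it.
        self.elem_type = self._infer_elem_type(self._items)

    @staticmethod
    def _infer_elem_type(items):
        kinds = {type(x) for x in items}
        if len(kinds) == 1:
            return kinds.pop().__name__
        return "Any"  # empty or mixed lists fall back to a generic tag

    def to_int_vector(self):
        """Return a by-value copy, loosely analogous to toTensorVector."""
        if self.elem_type != "int":
            raise TypeError(f"expected List[int], got List[{self.elem_type}]")
        return list(self._items)


ints = GenericList([3, 1, 1])
copy = ints.to_int_vector()
copy.append(7)  # mutating the copy leaves the original list untouched

floats = GenericList([0.5, 2.0])
print(ints.elem_type, floats.elem_type, copy, ints.to_int_vector())
```

Because the accessor copies, callers pay a conversion cost on every call. The benchmark in the commit message is the evidence offered that this cost is not measurable for `aten::_convolution`.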
davidriazati	3c07eb33bb	Better error for `torch::jit::load`ing a eager file (#31709 ) Summary: This adds a check to catch the case where someone `torch.save`s something then `torch::jit::load`s it in C++. Relevant for #31620 Pull Request resolved: https://github.com/pytorch/pytorch/pull/31709 Pulled By: driazati Differential Revision: D19252172 fbshipit-source-id: f2a9b4442647285418b2778306629b4ff77c15e5	2020-01-07 16:20:42 -08:00


66 Commits