pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Ivan Kobzarev	2fc73622f8	[jit] Support Awaitable type (#90863 ) We want to make TorchRec sharded models TorchScriptable. TorchRec sharded models use the generic types Awaitable[W] and LazyAwaitable[W] (https://github.com/pytorch/torchrec/blob/main/torchrec/distributed/types.py#L212). In a sharded model those types are used in place of the contained type W and carry an initialization function that produces an object of type W. When the first attribute of W is requested, `LazyAwaitable[W]` calls its initialization function (on the same stack), caches the result, and from then on works transparently as an object of type W. We can think of it as delayed object initialization. To support this behavior in TorchScript, we propose a new TorchScript type, `Await`. In eager mode it works the same as `LazyAwaitable[W]` in TorchRec: it is dynamically typed and acts as a `W` while being an `Await[W]`. Within TorchScript it is `Await[W]` and can only be converted to W explicitly, using the special function `torch.jit._awaitable_wait(aw)`. An `Await[W]` is created via another special function, `torch.jit._awaitable(func, args)`. The semantics are close to `torch.jit.Future`, fork, and wait, and use the same jit mechanics (inlined fork closures), with the difference that the function is not started in parallel on fork. It is only stored as a lambda inside an IValue and is called on the same thread when `torch.jit._awaitable_wait` is called. For example (more examples in this PR's `test/jit/test_await.py`) ``` def delayed(z: Tensor) -> Tensor: return z * 3 @torch.jit.script def fn(x: Tensor): aw: Await[Tensor] = torch.jit._awaitable(delayed, x) a = torch.eye(2) b = torch.jit._awaitable_wait(aw) return a + b + x ``` Function semantics: `_awaitable(func -> Callable[Tuple[...], W], *args, **kwargs) -> Await[W]` Creates an Await object that owns args and kwargs. On the first _awaitable_wait call, it executes func and owns the result. 
Subsequent _awaitable_wait calls return the result of that first function call. `_awaitable_wait(Await[W]) -> W` Calls the stored function on the first _awaitable_wait call to this Await object; otherwise returns the cached result of type W. `_awaitable_nowait(W) -> Await[W]` Creates a trivial Await[W] wrapper around the specified object, to stay type compliant in corner cases. Differential Revision: [D42502706](https://our.internmc.facebook.com/intern/diff/D42502706) Pull Request resolved: https://github.com/pytorch/pytorch/pull/90863 Approved by: https://github.com/davidberard98	2023-01-30 17:38:59 +00:00
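The delayed-initialization semantics described in this commit can be sketched in plain Python. This is a minimal illustration under stated assumptions, not TorchScript's implementation: the `LazyAwait` class and the `awaitable`, `awaitable_wait`, and `awaitable_nowait` helpers are hypothetical stand-ins for `Await[W]`, `torch.jit._awaitable`, `torch.jit._awaitable_wait`, and `torch.jit._awaitable_nowait`.

```python
from typing import Any, Callable, Generic, Optional, TypeVar

W = TypeVar("W")


class LazyAwait(Generic[W]):
    """Hypothetical stand-in for Await[W]: owns func, args, kwargs, and the cached result."""

    def __init__(self, func: Optional[Callable[..., W]], *args: Any, **kwargs: Any) -> None:
        self._func = func
        self._args = args
        self._kwargs = kwargs
        self._done = False          # True once the result has been computed or supplied
        self._result: Optional[W] = None


def awaitable(func: Callable[..., W], *args: Any, **kwargs: Any) -> LazyAwait:
    # Nothing runs here; the call is only recorded for later.
    return LazyAwait(func, *args, **kwargs)


def awaitable_wait(aw: LazyAwait) -> Any:
    # First call executes func on the calling thread; later calls return the cache.
    if not aw._done:
        aw._result = aw._func(*aw._args, **aw._kwargs)
        aw._done = True
    return aw._result


def awaitable_nowait(value: W) -> LazyAwait:
    # Trivial wrapper around an already-computed value.
    aw: LazyAwait = LazyAwait(None)
    aw._result = value
    aw._done = True
    return aw
```

Unlike a fork/wait pair, nothing in this sketch runs in parallel: the stored function executes only when `awaitable_wait` is first called, and only once.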
Nikita Shulga	8f1c3c68d3	[BE] Use nested namespaces in .cpp/.cu files (#92100 ) As we live in C++17 world This is a functional no-op, just - `s/namespace at { namespace native {/namespace at::native {/` - `s/namespace torch { namespace jit {/namespace torch::jit {/` Pull Request resolved: https://github.com/pytorch/pytorch/pull/92100 Approved by: https://github.com/izaitsevfb	2023-01-13 16:32:34 +00:00
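The substitution quoted in this commit can be approximated with a regular expression. The sketch below is an assumption about how such a rewrite could be scripted, not the tooling used for the PR; the function name `collapse_nested_namespaces` is hypothetical, and it only rewrites the opening line, so the matching closing braces would still need separate handling.

```python
import re

# Matches two namespace openers on one line, e.g. "namespace at { namespace native {".
_NESTED = re.compile(r"namespace (\w+) \{ namespace (\w+) \{")


def collapse_nested_namespaces(source: str) -> str:
    """Rewrite 'namespace a { namespace b {' as the C++17 form 'namespace a::b {'."""
    return _NESTED.sub(r"namespace \1::\2 {", source)
```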
BowenBao	66745831d7	[ONNX] Support constant 'aten::__contains__' (#91660 ) #84624 introduces an update on `torch.norm` [dispatch logic](`eaa43d9f25/torch/functional.py (L1489)`) which now depends on `layout`, resulting in regressions when exporting related operators from TorchScript. This PR resolves the regression by partially supporting a subset use case of the `prim::layout` (only `torch.strided`) and `aten::__contains__` (only constants) operators. Properly supporting other layouts, e.g. `torch.sparse_coo`, requires much more effort: extending JIT types and supporting the related family of ops like `aten::to_sparse`. This is out of the scope of this PR. Fixes #83661 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91660 Approved by: https://github.com/justinchuby, https://github.com/kit1980	2023-01-06 01:39:32 +00:00
Aaron Gokaslan	3916d7a575	Apply modernize-use-emplace to aten, c10, torch (#91077 ) Apply clang-tidy check modernize-use-emplace. This is slightly more efficient by using an inplace constructor and is the recommended style in parts of the codebase covered by clang-tidy. This just manually applies the check to rest of the codebase. Pinging @ezyang as this is related to my other PRs he reviewed like #89000 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91077 Approved by: https://github.com/ezyang	2022-12-19 07:49:56 +00:00
Wei-Sheng Chin	19d7941e37	Fix Python-bound function signature (torch._C.Graph.addInput) (#88528 ) In pytorch/torch/_C/__init__.pyi, Graph.addInput has signature ```python def addInput(self, name: str) -> Value: ... ``` which doesn't match the corresponding function ```cpp Value* addInput(const std::string& name = "") { return block_->addInput(name); } ``` in python_ir.cpp. This PR aligns the bound function on both C++ and Python sides. Without this PR, mypy will complain whenever a change contains some calls to `addInput`; for example, ![image](https://user-images.githubusercontent.com/3524474/200092086-429b8d63-9321-4d03-b0d6-f4c9bd361756.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88528 Approved by: https://github.com/davidberard98	2022-11-09 01:31:45 +00:00
BowenBao	45274c56a4	[ONNX] Partially re-enable RoiAlign and RoiPool unit tests (#86169 ) This PR depends on https://github.com/pytorch/vision/pull/6685 Pull Request resolved: https://github.com/pytorch/pytorch/pull/86169 Approved by: https://github.com/justinchuby, https://github.com/AllenTiTaiWang, https://github.com/abock	2022-10-13 14:39:44 +00:00
David Berard	1f99bdfcc4	[JIT] Retry - Support scripting torch.is_autocast_enabled() (#82394 ) This adds an `aten::is_autocast_enabled` op into the jit runtime so that autocasting ops can be scripted and called from within jit. Differential Revision: [D38294040](https://our.internmc.facebook.com/intern/diff/D38294040) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82394 Approved by: https://github.com/eellison	2022-08-10 18:26:17 +00:00
PyTorch MergeBot	554b4060aa	Revert "[JIT] Support scripting torch.is_autocast_enabled() (#81305 )" This reverts commit `bcc9084bc4`. Reverted https://github.com/pytorch/pytorch/pull/81305 on behalf of https://github.com/malfet due to Broke lite-interpreter builds, see https://github.com/pytorch/pytorch/runs/7550084494?check_suite_focus=true	2022-07-28 00:02:53 +00:00
David Berard	bcc9084bc4	[JIT] Support scripting torch.is_autocast_enabled() (#81305 ) This adds an `aten::is_autocast_enabled` op into the jit runtime so that autocasting ops can be scripted and called from within jit. Differential Revision: [D37901585](https://our.internmc.facebook.com/intern/diff/D37901585) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81305 Approved by: https://github.com/qihqi, https://github.com/eellison	2022-07-27 22:32:08 +00:00
Michael Andreas Dagitses	acd072967a	canonicalize includes of form <aten/src/ATen/...> Pull Request resolved: https://github.com/pytorch/pytorch/pull/78033 This was never intended to be supported. @override-unit-failures (Note: this ignores all push blocking failures!) Differential Revision: [D36567054](https://our.internmc.facebook.com/intern/diff/D36567054/) Approved by: https://github.com/kit1980	2022-06-16 17:46:45 +00:00
Michael Andreas Dagitses	ab2ca95dd1	turn on -Werror=unused-variable in our Bazel CPU build Summary: We also fix any existing issues. Note that we only do this for the CPU build because nvcc is considered a C++ toolchain but it does not have the same flag support. Adding flags to the GPU build will cause nvcc errors. Test Plan: Built locally, rely on CI to confirm. Reviewers: malfet Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/79156 Approved by: https://github.com/seemethere, https://github.com/osalpekar, https://github.com/albanD	2022-06-11 02:46:34 +00:00
Elias Ellison	678213ead2	Fake Tensor Part 1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77969 Approved by: https://github.com/ezyang	2022-05-31 16:20:35 +00:00
David Berard	d0dc7cb774	Reland "[JIT] during freezing, cast optional bias to half if weight is half" Original PR: #77295 Original commit message: On GPU, conv errors if not all its inputs have the same dtype. In the case of autocasting during freezing, what we see is: 1) inputs to conv are cast to half 2) inputs to batchnorm are not cast, so many are still floats 3) we try to fold conv + batchnorm, by finding different weight and bias such that conv(input, new_weight, new_bias) is equivalent to the original conv -> batchnorm. If conv's optional bias was previously unset, then during freezing we temporarily create a zero-valued bias as a placeholder for conv_bias. We want to construct it with the same dtype as the weight input to conv, to avoid errors on GPU. Reland changes: There's a memory leak from the CUDA caching allocator that is a side effect of this fix. The memory leak causes the test to fail, though for some reason it didn't fail on CI in the last PR. This skips the tests for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77617 Approved by: https://github.com/eellison	2022-05-17 12:25:26 +00:00
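The conv + batchnorm folding mentioned in step 3 can be sketched with per-channel scalars. This is an illustrative sketch under simplifying assumptions, not the freezing pass itself: each output channel is reduced to a single scalar weight, the function name `fold_conv_bn` and all parameter names are hypothetical, and because plain Python floats are used, the dtype matching that the commit is about is only noted in a comment.

```python
import math
from typing import List, Optional, Tuple


def fold_conv_bn(
    conv_weight: List[float],
    conv_bias: Optional[List[float]],
    bn_mean: List[float],
    bn_var: List[float],
    bn_gamma: List[float],
    bn_beta: List[float],
    eps: float = 1e-5,
) -> Tuple[List[float], List[float]]:
    """Return (new_weight, new_bias) so that conv(x, new_weight, new_bias)
    equals batchnorm(conv(x, conv_weight, conv_bias)) per channel."""
    if conv_bias is None:
        # Zero placeholder for the missing bias. In the real pass this
        # placeholder is the tensor that must share the weight's dtype.
        conv_bias = [0.0] * len(conv_weight)
    new_weight, new_bias = [], []
    for w, b, mean, var, gamma, beta in zip(
        conv_weight, conv_bias, bn_mean, bn_var, bn_gamma, bn_beta
    ):
        scale = gamma / math.sqrt(var + eps)
        new_weight.append(w * scale)
        new_bias.append((b - mean) * scale + beta)
    return new_weight, new_bias
```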
PyTorch MergeBot	246078e251	Revert "[JIT] during freezing, cast optional bias to half if weight is half" This reverts commit `2547be5135`. Reverted https://github.com/pytorch/pytorch/pull/77295 on behalf of https://github.com/malfet	2022-05-17 00:34:51 +00:00
David Berard	2547be5135	[JIT] during freezing, cast optional bias to half if weight is half On GPU, conv errors if not all its inputs have the same dtype. In the case of autocasting during freezing, what we see is: 1) inputs to conv are cast to half 2) inputs to batchnorm are not cast, so many are still floats 3) we try to fold conv + batchnorm, by finding different weight and bias such that conv(input, new_weight, new_bias) is equivalent to the original conv -> batchnorm. If conv's optional bias was previously unset, then during freezing we temporarily create a zero-valued bias as a placeholder for conv_bias. We want to construct it with the same dtype as the weight input to conv, to avoid errors on GPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77295 Approved by: https://github.com/eellison	2022-05-16 22:18:47 +00:00
max	25a6aabe71	Expose permute inputs (#77391 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/77391 Approved by: https://github.com/eellison	2022-05-13 22:18:51 +00:00
BowenBao	679fc90cdb	[ONNX] Support optional type (#68793 ) (#73284 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73284 Some important ops won't support optional type until opset 16, so we can't fully test things end-to-end, but I believe this should be all that's needed. Once ONNX Runtime supports opset 16, we can do more testing and fix any remaining bugs. Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D34625646 Pulled By: malfet fbshipit-source-id: 537fcbc1e9d87686cc61f5bd66a997e99cec287b Co-authored-by: BowenBao <bowbao@microsoft.com> Co-authored-by: neginraoof <neginmr@utexas.edu> Co-authored-by: Nikita Shulga <nshulga@fb.com> (cherry picked from commit 822e79f31ae54d73407f34f166b654f4ba115ea5)	2022-05-04 20:24:30 +00:00
Nikolay Korovaiko	5177f95d21	Introducing SymInt to Pytorch (for tracing size arithmetic) (master rebase) (#74861 ) Summary: This PR introduces `SymInt` type to Pytorch which will be used by LTC and AOTAutograd for tracing size arithmetic and tests. `SymInt` is a C++ union structure [int64_t, SymbolicIntNode*] that wraps around an int64_t field where the value of the field could be an index into a list of `shared_ptr<SymbolicIntNode>` or a real int. This PR doesn't add any support for actually tracing symbolic ints. i.e. data_ for now can only contain real ints. ``` Goal 1: just to show we can add a type to PyTorch core. (wraps int) LANDEABLE Finalize the naming - symint Want the name to be short Does invoke “size” - NO SInt/SymInt/SymbolicInt SInt could mean signed int sym_int or symint or SymInt (originally it was “int”; capitalized implies object semantics, whereas lowercase implies value semantics) JIT schema - symint C++ - symint ``` See more details here: https://docs.google.com/document/d/1iiLNwR5ohAsw_ymfnOpDsyF6L9RTUaHMpD8 (`d843f63f2a`)YLw-jxEw Pull Request resolved: https://github.com/pytorch/pytorch/pull/74861 Reviewed By: qihqi, ngimel Differential Revision: D35226230 Pulled By: Krovatkin fbshipit-source-id: 34acf342bd50fcaa4d8d5dd49c2fd6a98823a5b3 (cherry picked from commit 218643f63ef181cabb92d13a6e837eb64f2dda3c)	2022-03-31 21:59:59 +00:00
BowenBao	54a6942f8d	[ONNX] ONNX Exporter logging (#71342 ) Summary: Add ONNX exporter logging facility. Supporting both C++/Python logging api. Logging can be turned on/off. Logging output stream can be either set to `stdout` or `stderr`. A few other changes: * When exception is raised in passes, the current IR graph being processed will be logged. * When exception is raised from `_jit_pass_onnx` (the pass that converts nodes from namespace `ATen` to `ONNX`), both ATen IR graph and ONNX IR graph under construction will be logged. * Exception message for ConstantFolding is truncated to avoid being too verbose. * Update the final printed IR graph with node name in ONNX ModelProto as node attribute. Torch IR Node does not have name. Adding this to printed IR graph helps debugging. Pull Request resolved: https://github.com/pytorch/pytorch/pull/71342 Reviewed By: msaroufim Differential Revision: D34433473 Pulled By: malfet fbshipit-source-id: 4b137dfd6a33eb681a5f2612f19aadf5dfe3d84a (cherry picked from commit 67a8ebed5192c266f604bdcca931df6fe589699f)	2022-03-17 19:40:03 +00:00
David Berard	b27ec57331	[JIT] script & logging for extracting IR from logs (#72889 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72889 The script along with the GRAPH_EXPORT macro will allow for an easy way to extract IR from logs. One use case in this diff is to extract the fusion groups from nvfuser, so that the fusions can be tested individually. Usage (e.g. for nvfuser test) 1. Write some test.py file that uses nvfuser 2. `PYTORCH_JIT_LOG_LEVEL=">>graph_fuser" python3 test.py 2>&1 \| tee output.txt` 3. `python3 pytorch/scripts/jit/log_extract.py output.txt --nvfuser` This will run with and without nvfuser to compare the output. Alternatively, use `--output` to dump the IR so that it can be used in other applications. Currently, only `--output` works (since generating input tensors is not supported) Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D34440189 Pulled By: davidberard98 fbshipit-source-id: fca0f619200ee37aba34bb39b69e6c640c263e26 (cherry picked from commit eb319166075db160f1628f0de545641fbecde8be)	2022-03-02 18:34:35 +00:00
Elias Ellison	ab6395fc65	Add api for recursively analyzing function calls (#73329 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73329 There is a quantization use case for having better alias analysis with function calls remaining. This does the relatively dumb approach of getting the inlined graph of each function call, and then analyzing that subgraph. Since we need a unique single analysis of every `Value*`, for every function call make a copy of the graph for every analysis past the first. This is relatively slow, but given the limited use case here should work well enough (and is no slower than calling the inlining pass). cc vkuzo Test Plan: Imported from OSS Reviewed By: davidberard98 Differential Revision: D34451424 Pulled By: eellison fbshipit-source-id: b7c7e54679d723f5ded1e11ffb32eb6d2176431d (cherry picked from commit 81a42b31522b890311a3f512448b372c4ebbefd1)	2022-02-28 17:44:45 +00:00
Elias Ellison	8bc28e9c9c	[JIT] Add more python ir utilities (#69871 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69871 Test Plan: Imported from OSS Reviewed By: jbschlosser Differential Revision: D33515232 Pulled By: eellison fbshipit-source-id: d48da7b398a3f1a8862789484a4035d874196763 (cherry picked from commit e5976b8b7a4995be25a93601bbae5c52d6d3fca8)	2022-02-25 01:07:05 +00:00
BowenBao	04c5d978b9	[ONNX] Refactor _run_symbolic_function (#67573 ) (#68491 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68491 * Allows implementing symbolic functions for domains other than `aten`, for example `prim`, in symbolic_opset#.py. * Allows symbolic function to access extra context if needed, through `SymbolicFunctionState`. * Particularly, the `prim::PythonOp` special case can access node without the need of passing node through inputs. Updates will be made downstreams, and in a follow-up PR we will remove the previous workaround in exporter. * `prim::Loop`, `prim::If`, etc are now moved outside of `_run_symbolic_function` from utils.py, and to symbolic_opset9.py. Motivation for this change: - Better maintainability and reducing complexity. Easier to add symbolic for operators, both simple and complex ones (that need additional context), without the former needing to know the existence of the latter. - The design idea was long outdated. prim ops are no longer rare special cases, and they shouldn't all be handled inside `_run_symbolic_function`. As a result this function becomes too clumsy. There were also prim ops symbolic added in symbolic_opset#.py with signature `prim_[opname]`, creating separation and confusion. Test Plan: Imported from OSS Reviewed By: jansel Differential Revision: D32483782 Pulled By: malfet fbshipit-source-id: f9affc31b1570af30ffa6668da9375da111fd54a Co-authored-by: BowenBao <bowbao@microsoft.com> (cherry picked from commit `1e04ffd2fd`)	2022-02-11 18:35:35 +00:00
Elias Ellison	59a6375639	[NNC] Add Tests for Dynamic Shape Fusion Change default fusion strategy (#71651 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71651 The only tests that regress are because chunk NYI, the other tests that I touched were passing just because the `assertAllFused` wasn't working correctly. That, and we're no longer compiling conv/matmul w dynamic shapes Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D33801500 Pulled By: eellison fbshipit-source-id: 074118ab4a975b7db876a4fcdfb9483afb879e79 (cherry picked from commit `abaa7948c1`)	2022-02-01 19:07:02 +00:00
CodemodService FBSourceClangFormatLinterBot	88012c7daf	[AutoAccept][Codemod][FBSourceClangFormatLinter] Daily `arc lint --take CLANGFORMAT` Reviewed By: zertosh Differential Revision: D33577744 fbshipit-source-id: 7ecc8367998ee1dffde54c2f4dd3cfafe19a53c9	2022-01-14 06:10:57 -08:00
John Clow	ade83ed90c	Building Default Inference for Device Type (#69049 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69049 Test Plan: Imported from OSS Reviewed By: anjali411 Differential Revision: D33555885 Pulled By: Gamrix fbshipit-source-id: 7364066cbc544ab8442a47c82ea89f0e73eaaa06	2022-01-13 13:57:08 -08:00
Elias Ellison	97e8dcba5e	Fix mis-specified device arg name (#69645 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69645 As noted in code comment: existing device operator is registered with input name `a`, which prevents torch.device(type="cuda") from working. add shim-layer here Test Plan: Imported from OSS Reviewed By: jbschlosser Differential Revision: D33515231 Pulled By: eellison fbshipit-source-id: c04af8158a9568a20cd5fbbbd573f6efab98fd60	2022-01-11 22:11:24 -08:00
Scott Wolchok	ddea6980fe	[PyTorch][JIT] Don't refcount Type singletons (#69579 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69579 This should help us avoid reference counting overhead on singleton Type subclasses without a major rewrite of the Type subsystem. ghstack-source-id: 146643993 Test Plan: Ran //caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads:cpp_benchmark with arguments `--op empty -niter 40 --stressTestRecordFunction --captureRecordFunctionInputs` on devbig with turbo off. Before: ``` I1206 13:47:15.037441 1201670 bench.cpp:144] Mean 0.737675 I1206 13:47:15.037463 1201670 bench.cpp:145] Median 0.736725 I1206 13:47:15.037468 1201670 bench.cpp:146] Min 0.722897 I1206 13:47:15.037473 1201670 bench.cpp:147] stddev 0.00508187 I1206 13:47:15.037482 1201670 bench.cpp:148] stddev / mean 0.00688903 ``` After: ``` I1206 13:48:16.830123 1205612 bench.cpp:144] Mean 0.66988 I1206 13:48:16.830150 1205612 bench.cpp:145] Median 0.663956 I1206 13:48:16.830157 1205612 bench.cpp:146] Min 0.65986 I1206 13:48:16.830164 1205612 bench.cpp:147] stddev 0.0335928 I1206 13:48:16.830171 1205612 bench.cpp:148] stddev / mean 0.0501475 ``` Static runtime startup is also improved; for CMF local_ro, time to initialize a predictor went from 10.01s to 9.59s. (Note: I wish I had a production workload to demonstrate the advantage of this on. I tried ctr_mobile_feed local_ro net but it was neutral. Anything that manipulates types or List/Dict a lot might be promising.) Reviewed By: suo Differential Revision: D32923880 fbshipit-source-id: c82ed6689b3598e61047fbcb2149982173127ff0	2022-01-06 17:39:16 -08:00
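The relative improvement implied by the figures quoted in this commit can be computed directly. The sketch below only restates arithmetic on the numbers in the message; the helper name `relative_reduction` is hypothetical, and the benchmark's units are not stated in the source.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a measurement dropped, relative to its old value."""
    return (before - after) / before


# Figures copied from the commit message above.
mean_reduction = relative_reduction(0.737675, 0.66988)   # benchmark mean
startup_reduction = relative_reduction(10.01, 9.59)       # predictor init, seconds

print(f"benchmark mean reduced by {mean_reduction:.1%}")
print(f"predictor startup reduced by {startup_reduction:.1%}")
```

The benchmark mean drops by roughly 9% and predictor startup by roughly 4%.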
Deyu Huang	d32efe8bc2	[ONNX] Remove the argument use_external_data_format of export() method entirely. (#67080 ) (#67811 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67811 * remove the argument use_external_data_format of export() method entirely Test Plan: Imported from OSS Reviewed By: msaroufim Differential Revision: D32181302 Pulled By: malfet fbshipit-source-id: 4bc1448b7487bb9dfdad4e36008ff5b227fd64a3 Co-authored-by: hwangdeyu <dejack953@outlook.com>	2021-11-15 17:20:04 -08:00
Thomas Viehmann	be281fc597	Check for None in torch.jit.Graph.create (#68253 ) Summary: ...because we don't like segfaults from Python (see test). Pull Request resolved: https://github.com/pytorch/pytorch/pull/68253 Reviewed By: suo Differential Revision: D32396747 Pulled By: gmagogsfm fbshipit-source-id: a0925e8479702766e88176280985a63bc79e4f6a	2021-11-13 11:30:33 -08:00
Elias Ellison	6b44e75f6b	aliasing fixes (#66977 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66977 Fix for https://github.com/pytorch/pytorch/issues/47218 More context is in original PR here: https://github.com/pytorch/pytorch/pull/20556 Test Plan: Imported from OSS Reviewed By: malfet, albanD Differential Revision: D31935573 Pulled By: eellison fbshipit-source-id: 3658d5711116396c35f1d5016773b0096ed347a5	2021-11-09 18:33:37 -08:00
Bowen Bao	02e35ce17b	[ONNX] Update onnx function export with comments and clean up (#66817 ) (#67803 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67803 * Addresses comments from #63589 [ONNX] remove torch::onnx::PRODUCER_VERSION (#67107) Use constants from version.h instead. This simplifies things since we no longer have to update PRODUCER_VERSION for each release. Also add TORCH_VERSION to version.h so that a string is available for this purpose. [ONNX] Set `ir_version` based on opset_version. (#67128) This increases the odds that the exported ONNX model will be usable. Before this change, we were setting the IR version to a value which may be higher than what the model consumer supports. Also some minor clean-up in the test code: * Fix string replacement. * Use a temporary file so as to not leave files around in the test current working directory. Test Plan: Imported from OSS Reviewed By: msaroufim Differential Revision: D32181306 Pulled By: malfet fbshipit-source-id: 02f136d34ef8f664ade0bc1985a584f0e8c2b663 Co-authored-by: BowenBao <bowbao@microsoft.com> Co-authored-by: Gary Miguel <garymiguel@microsoft.com> Co-authored-by: Nikita Shulga <nshulga@fb.com>	2021-11-05 10:35:35 -07:00
John Clow	ec8a71f9ac	Dtype Analysis for Unary and Binary ops with Metatensors (#66898 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66898 Test Plan: Imported from OSS Reviewed By: malfet Differential Revision: D32175961 Pulled By: Gamrix fbshipit-source-id: 72721259b900e5a311b6bcb5c350366ba420b734	2021-11-04 19:00:50 -07:00
Elias Ellison	2486061c72	[JIT] make x (+ or -) 0 and x (* or /) 1 peepholes type promotion aware (#67688 ) Summary: Some of the "no-ops" are not actually no-ops because they can change the dtype Pull Request resolved: https://github.com/pytorch/pytorch/pull/67688 Reviewed By: davidberard98 Differential Revision: D32104601 Pulled By: eellison fbshipit-source-id: ccb99179a4b30fd20b5a9228374584f2cdc8ec21	2021-11-03 20:11:46 -07:00
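Why these "no-ops" can change the dtype can be shown with a toy promotion model. This is a simplified sketch, not the JIT peephole pass: dtypes are reduced to the two kinds `"int"` and `"float"`, and the names `result_kind` and `can_elide` are hypothetical. It assumes only two PyTorch behaviors: combining an integer tensor with a float scalar yields a floating-point tensor, and `/` on an integer tensor yields a floating-point tensor.

```python
def result_kind(op: str, tensor_kind: str, scalar_kind: str) -> str:
    """Kind ('int' or 'float') of `tensor <op> scalar` in this toy model."""
    if op == "div":
        return "float"  # true division never stays integral
    return "float" if "float" in (tensor_kind, scalar_kind) else "int"


def can_elide(op: str, tensor_kind: str, scalar) -> bool:
    """True if `tensor <op> scalar` can be replaced by `tensor` unchanged."""
    identity = 0 if op in ("add", "sub") else 1
    if scalar != identity:
        return False  # not an identity operand at all
    scalar_kind = "float" if isinstance(scalar, float) else "int"
    # Only a real no-op if the result keeps the input's kind.
    return result_kind(op, tensor_kind, scalar_kind) == tensor_kind
```

For example, `can_elide("add", "int", 0.0)` is false in this model because the result would be promoted to a floating-point kind, while `can_elide("mul", "float", 1)` is true.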
Zhengxu Chen	059ae96007	[jit] Factor findAllNodes into one place. (#65965 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65965 ghstack-source-id: 141504185 Test Plan: no behavior change Reviewed By: qihqi, ejguan Differential Revision: D31326152 fbshipit-source-id: 2e0261a96853bfb67a96dd68972c905b6b26d562	2021-10-25 15:42:52 -07:00
Nikita Shulga	53a163a015	[ONNX] Export nn.Module call as ONNX local function (#63589 ) (#66140 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66140 * Add new argument to export api to enable users specifying `nn.Module` classes that they wish to be exported as local function in ONNX model. * Refactor `torch/csrc/jit/serialization/export.cpp`, and remove redundant `EncoderBase` class. * ~~Contains changes from #63268~~ * Depends on #63716 to update onnx submodule. Test Plan: Imported from OSS Reviewed By: jansel Differential Revision: D31424098 fbshipit-source-id: c949d0b01c206c30b4182c2dd1a5b90e32b7a0d3 Co-authored-by: BowenBao <bowbao@microsoft.com>	2021-10-22 13:44:56 -07:00
Elias Ellison	63b41e1f4d	[JIT] Add partial evaluation graph stitching logic (#65377 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65377 When we run symbolic shape analysis on ``` conv = torch.nn.Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False) max_pool = torch.nn.MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False) mod = nn.Sequential(conv1, max_pool) ... graph(%self : __torch__.torch.nn.modules.container.___torch_mangle_0.Sequential, %input.1 : Tensor): %18 : bool = prim::Constant[value=0]() %30 : int[] = prim::Constant[value=[1, 1]]() %29 : int[] = prim::Constant[value=[3, 3]]() %28 : int[] = prim::Constant[value=[2, 2]]() %6 : int = prim::Constant[value=1]() %self.0.bias : NoneType = prim::Constant() %self.0.weight : Double(64, 3, 7, 7, strides=[147, 49, 7, 1], requires_grad=0, device=cpu) = prim::Constant[value=<Tensor>]() %input.5 : Tensor(SS(-2), 64, SS(-3), SS(-4)) = aten::conv2d(%input.1, %self.0.weight, %self.0.bias, %28, %29, %30, %6) %input.9 : Tensor(SS(-2), 64, SS(-5), SS(-6)) = aten::max_pool2d(%input.5, %29, %28, %30, %30, %18) return (%input.9) ``` we partially evaluate the shape compute graph of `conv2d`, whose output gets passed in and used to partially evaluate the shape compute graph of `max_pool2d`. The conv2d remaining partially eval'd graph is [here](https://gist.github.com/eellison/0598bd224a422211efa1a45d2b7560b7), and the maxpool2d eval'd graph is [here](https://gist.github.com/eellison/625540b84f650ddbefd3ae5511ab8814). We can take the partially eval'd graphs of a series of operators and stitch them together, which allows us to a) recover symbolic equivalences by CSE'ing & other optimizations b) calculate shapes for a whole block of operators just on the input, such as for fusing the whole model to nnc with dynamic shapes and then passing along the computed symbolic shapes. the calculation will also handle error handling. 
c) (future-looking) generate inputs on demand for straight-line networks that are composed just of aten operators The combined graph of the two gives us compute for the unknown symbolic dimensions - `SS(-2), SS(-3), SS(-4), SS(-5), and SS(-6)`. ``` graph(%input.1 : int[]): %42 : bool = prim::Constant[value=0]() # <string>:152:17 %15 : int = prim::Constant[value=3]() %input_batch_size_dim.1 : int = prim::Constant[value=0]() # <string>:417:41 %13 : int = prim::Constant[value=1]() # <string>:426:61 %12 : int = prim::Constant[value=4]() # <string>:437:32 %11 : str = prim::Constant[value="AssertionError: "]() %9 : int = prim::Constant[value=2]() %8 : int = prim::Constant[value=6]() %7 : int = prim::Constant[value=7]() %16 : int = aten::len(%input.1) # <string>:438:17 %17 : bool = aten::eq(%16, %12) # <string>:438:17 = prim::If(%17) # <string>:438:10 block0(): -> () block1(): = prim::RaiseException(%11) # <string>:438:10 -> () %18 : int = aten::__getitem__(%input.1, %13) # <string>:407:17 %19 : bool = aten::eq(%18, %15) # <string>:407:17 = prim::If(%19) # <string>:407:10 block0(): -> () block1(): = prim::RaiseException(%11) # <string>:407:10 -> () %20 : int = aten::__getitem__(%input.1, %9) # <string>:411:20 %21 : int = aten::add(%20, %8) # <string>:411:20 %22 : bool = aten::ge(%21, %7) # <string>:411:20 = prim::If(%22) # <string>:411:12 block0(): -> () block1(): = prim::RaiseException(%11) # <string>:411:12 -> () %23 : int = aten::__getitem__(%input.1, %15) # <string>:411:20 %24 : int = aten::add(%23, %8) # <string>:411:20 %25 : bool = aten::ge(%24, %7) # <string>:411:20 = prim::If(%25) # <string>:411:12 block0(): -> () block1(): = prim::RaiseException(%11) # <string>:411:12 -> () %26 : int = aten::__getitem__(%input.1, %input_batch_size_dim.1) # <string>:422:29 %27 : int = aten::sub(%20, %13) # <string>:428:32 %28 : int = aten::floordiv(%27, %9) # <string>:428:32 %29 : int = aten::add(%28, %13) # <string>:428:32 %30 : int = aten::sub(%23, %13) # <string>:428:32 %31 : 
int = aten::floordiv(%30, %9) # <string>:428:32 %32 : int = aten::add(%31, %13) # <string>:428:32 %48 : int = aten::floordiv(%28, %9) # <string>:133:17 %outputSize.2 : int = aten::add(%48, %13) # <string>:136:23 %51 : int = aten::floordiv(%31, %9) # <string>:133:17 %outputSize.1 : int = aten::add(%51, %13) # <string>:136:23 %53 : bool = aten::ne(%29, %input_batch_size_dim.1) # <string>:156:41 %54 : bool = prim::If(%53) # <string>:157:64 block0(): %55 : bool = aten::ne(%32, %input_batch_size_dim.1) # <string>:157:93 -> (%55) block1(): -> (%42) = prim::If(%54) # <string>:157:10 block0(): -> () block1(): = prim::RaiseException(%11) # <string>:157:10 -> () %56 : bool = aten::ge(%outputSize.1, %13) # <string>:160:17 %57 : bool = prim::If(%56) # <string>:160:17 block0(): %58 : bool = aten::ge(%outputSize.2, %13) # <string>:160:38 -> (%58) block1(): -> (%42) = prim::If(%57) # <string>:160:10 block0(): -> () block1(): = prim::RaiseException(%11) # <string>:160:10 -> () return (%26, %29, %32, %outputSize.2, %outputSize.1) ``` This PR runs shape analysis, retains the partially evaluated graphs, and then stitches them together, keeping track of what inputs in the partial eval graph correspond to what inputs in the encompassing graph IR and what outputs correspond to what symbolic shape. Adding NNC ppl as reviewers because it is relevant to dynamic shape fusion. Question for reviewers : should I make this a separate file ? Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D31797472 Pulled By: eellison fbshipit-source-id: a41ed31fad085d3563e71c815f49af0cd18aaeed	2021-10-20 16:12:58 -07:00
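The stitched shape computation for this conv2d -> max_pool2d example can be sketched as an ordinary Python function. This is a hand-written illustration of what the combined graph computes, not the output of the pass: the function name `stitched_shapes` is hypothetical, the hyperparameters are taken from the `Conv2d` and `MaxPool2d` constructors quoted in the message, and reading the five returned values as `SS(-2)` through `SS(-6)` in order is an assumption based on the graph's return statement.

```python
from typing import List, Tuple


def stitched_shapes(input_shape: List[int]) -> Tuple[int, int, int, int, int]:
    """Compute (batch, conv_h, conv_w, pool_h, pool_w) from the input shape alone."""
    # Checks mirroring the RaiseException branches in the graph.
    assert len(input_shape) == 4, "expected an NCHW input"
    n, c, h, w = input_shape
    assert c == 3, "conv expects 3 input channels"
    assert h + 6 >= 7 and w + 6 >= 7, "padded input smaller than the kernel"

    # conv2d: kernel 7, stride 2, padding 3 -> (size + 2*3 - 7) // 2 + 1
    conv_h = (h - 1) // 2 + 1
    conv_w = (w - 1) // 2 + 1

    # max_pool2d: kernel 3, stride 2, padding 1 -> (size + 2*1 - 3) // 2 + 1
    pool_h = (conv_h - 1) // 2 + 1
    pool_w = (conv_w - 1) // 2 + 1

    return n, conv_h, conv_w, pool_h, pool_w
```

Because both stages are expressed on the same input, shared subexpressions such as `(h - 1) // 2` are computed once, which is the kind of equivalence that CSE recovers after stitching.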
Michael Suo	70c9eb130d	Revert D31732419: [JIT] Add partial evaluation graph stitching logic Test Plan: revert-hammer Differential Revision: D31732419 (`5db7db667f`) Original commit changeset: 883a55cbeef0 fbshipit-source-id: f5faba69dfb6b54aeb29d1beaeec8c5b0373830f	2021-10-19 20:07:04 -07:00
Elias Ellison	5db7db667f	[JIT] Add partial evaluation graph stitching logic (#65377 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65377 When we run symbolic shape analysis on
```
conv = torch.nn.Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
max_pool = torch.nn.MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
mod = nn.Sequential(conv, max_pool)
...
graph(%self : __torch__.torch.nn.modules.container.___torch_mangle_0.Sequential,
      %input.1 : Tensor):
  %18 : bool = prim::Constant[value=0]()
  %30 : int[] = prim::Constant[value=[1, 1]]()
  %29 : int[] = prim::Constant[value=[3, 3]]()
  %28 : int[] = prim::Constant[value=[2, 2]]()
  %6 : int = prim::Constant[value=1]()
  %self.0.bias : NoneType = prim::Constant()
  %self.0.weight : Double(64, 3, 7, 7, strides=[147, 49, 7, 1], requires_grad=0, device=cpu) = prim::Constant[value=<Tensor>]()
  %input.5 : Tensor(SS(-2), 64, SS(-3), SS(-4)) = aten::conv2d(%input.1, %self.0.weight, %self.0.bias, %28, %29, %30, %6)
  %input.9 : Tensor(SS(-2), 64, SS(-5), SS(-6)) = aten::max_pool2d(%input.5, %29, %28, %30, %30, %18)
  return (%input.9)
```
we partially evaluate the shape compute graph of `conv2d`, whose output gets passed in and used to partially evaluate the shape compute graph of `max_pool2d`. The remaining partially eval'd conv2d graph is [here](https://gist.github.com/eellison/0598bd224a422211efa1a45d2b7560b7), and the partially eval'd maxpool2d graph is [here](https://gist.github.com/eellison/625540b84f650ddbefd3ae5511ab8814). We can take the partially eval'd graphs of a series of operators and stitch them together, which allows us to
a) recover symbolic equivalences by CSE'ing & other optimizations
b) calculate shapes for a whole block of operators just on the input, such as for fusing the whole model to nnc with dynamic shapes and then passing along the computed symbolic shapes. The calculation will also take care of error handling.
c) (future-looking) generate inputs on demand for straight-line networks that are composed just of aten operators
The combined graph of the two gives us compute for the unknown symbolic dimensions - `SS(-2), SS(-3), SS(-4), SS(-5), and SS(-6)`.
```
graph(%input.1 : int[]):
  %42 : bool = prim::Constant[value=0]() # <string>:152:17
  %15 : int = prim::Constant[value=3]()
  %input_batch_size_dim.1 : int = prim::Constant[value=0]() # <string>:417:41
  %13 : int = prim::Constant[value=1]() # <string>:426:61
  %12 : int = prim::Constant[value=4]() # <string>:437:32
  %11 : str = prim::Constant[value="AssertionError: "]()
  %9 : int = prim::Constant[value=2]()
  %8 : int = prim::Constant[value=6]()
  %7 : int = prim::Constant[value=7]()
  %16 : int = aten::len(%input.1) # <string>:438:17
  %17 : bool = aten::eq(%16, %12) # <string>:438:17
   = prim::If(%17) # <string>:438:10
    block0():
      -> ()
    block1():
       = prim::RaiseException(%11) # <string>:438:10
      -> ()
  %18 : int = aten::__getitem__(%input.1, %13) # <string>:407:17
  %19 : bool = aten::eq(%18, %15) # <string>:407:17
   = prim::If(%19) # <string>:407:10
    block0():
      -> ()
    block1():
       = prim::RaiseException(%11) # <string>:407:10
      -> ()
  %20 : int = aten::__getitem__(%input.1, %9) # <string>:411:20
  %21 : int = aten::add(%20, %8) # <string>:411:20
  %22 : bool = aten::ge(%21, %7) # <string>:411:20
   = prim::If(%22) # <string>:411:12
    block0():
      -> ()
    block1():
       = prim::RaiseException(%11) # <string>:411:12
      -> ()
  %23 : int = aten::__getitem__(%input.1, %15) # <string>:411:20
  %24 : int = aten::add(%23, %8) # <string>:411:20
  %25 : bool = aten::ge(%24, %7) # <string>:411:20
   = prim::If(%25) # <string>:411:12
    block0():
      -> ()
    block1():
       = prim::RaiseException(%11) # <string>:411:12
      -> ()
  %26 : int = aten::__getitem__(%input.1, %input_batch_size_dim.1) # <string>:422:29
  %27 : int = aten::sub(%20, %13) # <string>:428:32
  %28 : int = aten::floordiv(%27, %9) # <string>:428:32
  %29 : int = aten::add(%28, %13) # <string>:428:32
  %30 : int = aten::sub(%23, %13) # <string>:428:32
  %31 : int = aten::floordiv(%30, %9) # <string>:428:32
  %32 : int = aten::add(%31, %13) # <string>:428:32
  %48 : int = aten::floordiv(%28, %9) # <string>:133:17
  %outputSize.2 : int = aten::add(%48, %13) # <string>:136:23
  %51 : int = aten::floordiv(%31, %9) # <string>:133:17
  %outputSize.1 : int = aten::add(%51, %13) # <string>:136:23
  %53 : bool = aten::ne(%29, %input_batch_size_dim.1) # <string>:156:41
  %54 : bool = prim::If(%53) # <string>:157:64
    block0():
      %55 : bool = aten::ne(%32, %input_batch_size_dim.1) # <string>:157:93
      -> (%55)
    block1():
      -> (%42)
   = prim::If(%54) # <string>:157:10
    block0():
      -> ()
    block1():
       = prim::RaiseException(%11) # <string>:157:10
      -> ()
  %56 : bool = aten::ge(%outputSize.1, %13) # <string>:160:17
  %57 : bool = prim::If(%56) # <string>:160:17
    block0():
      %58 : bool = aten::ge(%outputSize.2, %13) # <string>:160:38
      -> (%58)
    block1():
      -> (%42)
   = prim::If(%57) # <string>:160:10
    block0():
      -> ()
    block1():
       = prim::RaiseException(%11) # <string>:160:10
      -> ()
  return (%26, %29, %32, %outputSize.2, %outputSize.1)
```
This PR runs shape analysis, retains the partially evaluated graphs, and then stitches them together, keeping track of which inputs in the partial eval graph correspond to which inputs in the encompassing graph IR and which outputs correspond to which symbolic shape. Adding NNC ppl as reviewers because it is relevant to dynamic shape fusion. Question for reviewers: should I make this a separate file? Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D31732419 Pulled By: eellison fbshipit-source-id: 883a55cbeef0fd5a6068a779ffa89b6f537245b3	2021-10-19 16:41:19 -07:00
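As a reading aid for the stitched graph in the commit above, the same arithmetic can be written as plain Python. This is only a sketch transcribed by hand from the IR for this specific `Conv2d(7x7, stride 2, padding 3)` plus `MaxPool2d(3, stride 2, padding 1)` pair; the function name `conv_then_maxpool_shape` is hypothetical and is not part of PyTorch.

```python
from typing import List, Tuple


def conv_then_maxpool_shape(input_shape: List[int]) -> Tuple[int, int, int, int, int]:
    """Hand transcription of the stitched shape graph (illustrative only).

    Returns (batch, conv_h, conv_w, pool_h, pool_w), which correspond to
    SS(-2), SS(-3), SS(-4), SS(-5), SS(-6) in the commit message.
    """
    # Checks coming from the conv2d shape function.
    assert len(input_shape) == 4           # %16 / %17
    assert input_shape[1] == 3             # %18 / %19: input channels
    h, w = input_shape[2], input_shape[3]  # %20, %23
    assert h + 6 >= 7                      # %21 / %22: padded height >= kernel
    assert w + 6 >= 7                      # %24 / %25: padded width >= kernel

    batch = input_shape[0]                 # %26
    half_h = (h - 1) // 2                  # %27, %28
    half_w = (w - 1) // 2                  # %30, %31
    conv_h = half_h + 1                    # %29
    conv_w = half_w + 1                    # %32

    # max_pool2d part: (conv_h - 1) // 2 + 1 reuses half_h directly.
    pool_h = half_h // 2 + 1               # %48, %outputSize.2
    pool_w = half_w // 2 + 1               # %51, %outputSize.1

    # Checks coming from the max_pool2d shape function.
    assert conv_h != 0 and conv_w != 0     # %53 to %55
    assert pool_w >= 1 and pool_h >= 1     # %56 to %58
    return batch, conv_h, conv_w, pool_h, pool_w


print(conv_then_maxpool_shape([1, 3, 224, 224]))  # (1, 112, 112, 56, 56)
```

Note how `pool_h` is computed from `half_h` rather than from `conv_h`: in the IR, `%48` divides `%28` directly, so the `sub`/`add` pair between the two operators does not appear in the stitched graph.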
Gary Miguel	d1058df885	fix clang-tidy error introduced by #64382 (#65977 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65977 Reviewed By: ngimel Differential Revision: D31423174 Pulled By: malfet fbshipit-source-id: 0ea560b9a6ddd6431f70bd3ac10ace68e26ab352	2021-10-05 20:13:13 -07:00
BowenBao	20143bf07f	[ONNX] Deprecate use_external_data_format param from torch.onnx.export() function. (#62257 ) (#64382 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64382 * The `use_external_data_format` parameter is used for large models that cannot be exported because of the 2GB protobuf limit. * When `use_external_data_format` is set to True, the model is exported in ONNX external data format, in which case some of the model parameters are stored in external binary files and not in the ONNX model file itself. * This PR marks this parameter as DEPRECATED and checks the model proto size in code instead of relying on the user; if the size is larger than 2GB, `use_external_data_format = True` is applied automatically. Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D30905265 Pulled By: malfet fbshipit-source-id: 82b4e17bfa6a8de2bfd700a5282c12f6835603cb Co-authored-by: hwangdeyu <dejack953@outlook.com>	2021-09-23 22:20:48 -07:00
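The automatic decision described in the commit above can be pictured roughly as follows. This is a minimal sketch and not the exporter's actual code: the helper name `needs_external_data`, the byte-count input, and the exact threshold constant are all assumptions made for illustration.

```python
from typing import Iterable

# Assumption: the "2GB protobuf limit" is taken here as 2**31 bytes.
PROTOBUF_LIMIT_BYTES = 2 ** 31


def needs_external_data(tensor_nbytes: Iterable[int],
                        limit: int = PROTOBUF_LIMIT_BYTES) -> bool:
    """Hypothetical helper: return True when the summed tensor payload is
    larger than the limit, i.e. when parameters would have to be stored in
    external binary files instead of the model file itself."""
    return sum(tensor_nbytes) > limit


# Two 1 GiB tensors plus one extra byte exceed the limit.
print(needs_external_data([2 ** 30, 2 ** 30, 1]))  # True
```

The point of the change is that this decision is made from the model's size rather than from a flag the caller has to remember to set.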
Ansley Ussery	6831d8e379	Support Union in TorchScript (#64234 ) Summary: This PR is created to replace the https://github.com/pytorch/pytorch/pull/53180 PR stack, which has all the review discussions. A replacement is needed because of a messy Sandcastle issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/64234 Reviewed By: gmagogsfm Differential Revision: D30656444 Pulled By: ansley fbshipit-source-id: 77536c8bcc88162e2c72636026ca3c16891d669a	2021-09-03 06:12:24 -07:00
Elias Ellison	ea808df25d	Test shape analysis with opinfos (#59814 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59814 Using opinfos to test shape analysis. By default, we just check that we don't give incorrect answers, and then if `assert_jit_shape_analysis` is true, we test that we correctly propagate the full shape. It also found a couple of bugs 😃 Test Plan: Imported from OSS Reviewed By: Krovatkin Differential Revision: D30200058 Pulled By: eellison fbshipit-source-id: 6226be87f5390277cfa5a1fffaa1b072d4bc8803	2021-08-10 09:47:33 -07:00
Kimish Patel	026cfe85b4	Fix InlinedCallStack annotation to account for module calling its own methods from forward (#61791 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61791 During inlining we attach an InlinedCallStack to the nodes being inlined. In the process we attach module information as well, such that if a CallMethod is being inlined we know which class instance and class type the method belongs to. However, CallMethod can be calling a method of the same object to which the graph belongs, e.g.:
```
def forward(self, input):
    x = input + 10
    return self.forward_impl_(x, input)
```
Here forward_impl_ is a method defined on the same class in which forward is defined. The existing module hierarchy annotation will mislabel this as an unknown instance, since the method is not associated with the output of a GetAttr node (it would be if we had called self.conv.forward_impl_, for example). The change in this PR reconciles this by creating a placeholder name "SELF" for the module instance, indicating that you can traverse the InlinedCallStack backwards to find the first node with name != SELF, which would be the name of the object. e.g.: TOP(ResNet)::forward.SELF(ResNet)::_forward_impl.layer1(Sequential)::forward.0(BasicBlock)::forward.conv1(Conv2d)::forward.SELF(Conv2d)::_conv_forward Test Plan: Add test Imported from OSS Reviewed By: larryliu0820 Differential Revision: D29745443 fbshipit-source-id: 1525e41df53913341c4c36a56772454782a0ba93	2021-07-26 15:00:57 -07:00
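The backwards traversal described in the commit above can be illustrated on the example string it gives. This is a sketch only: it operates on the printed hierarchy string, not on the real InlinedCallStack objects, and the helper names `parse_call_stack` and `resolve_instance` are hypothetical.

```python
import re
from typing import List, Tuple

# Each entry looks like "instance(Type)::method"; entries are joined by ".".
ENTRY = re.compile(r"(\w+)\((\w+)\)::(\w+)")


def parse_call_stack(path: str) -> List[Tuple[str, str, str]]:
    """Split a printed hierarchy string into (instance, type, method) triples."""
    return ENTRY.findall(path)


def resolve_instance(entries: List[Tuple[str, str, str]], index: int) -> str:
    """Walk backwards from `index` to the first entry whose instance name
    is not the "SELF" placeholder, and return that name."""
    while index > 0 and entries[index][0] == "SELF":
        index -= 1
    return entries[index][0]


path = ("TOP(ResNet)::forward.SELF(ResNet)::_forward_impl."
        "layer1(Sequential)::forward.0(BasicBlock)::forward."
        "conv1(Conv2d)::forward.SELF(Conv2d)::_conv_forward")
entries = parse_call_stack(path)
print(resolve_instance(entries, len(entries) - 1))  # conv1
```

For the last entry, `SELF(Conv2d)::_conv_forward`, the walk stops at `conv1`, which is the object whose `forward` called `_conv_forward` on itself.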
Nikita Shulga	a9b0a921d5	Disable `avoid-non-const-global-variables` lint check (#62008 ) Summary: The GoogleTest `TEST` macro is non-compliant with it, as is `DEFINE_DISPATCH`. All changes except the ones to `.clang-tidy` were generated using the following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008 Reviewed By: driazati, r-barnes Differential Revision: D29838584 Pulled By: malfet fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13	2021-07-22 18:04:40 -07:00
BowenBao	95a7f3ccfe	[ONNX] Fix shape inference for large model (#59320 ) (#60244 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60244 Perform the 2GB size check for protocol buffer serialization at a later stage, to avoid false alarms in cases like shape inference, where no serialization actually happens. Test Plan: Imported from OSS Reviewed By: zou3519, ZolotukhinM Differential Revision: D29494910 Pulled By: SplitInfinity fbshipit-source-id: 4c36d26de9a94e5d6cf78f332d4dffc46588ebf0 Co-authored-by: BowenBao <bowbao@microsoft.com>	2021-07-08 16:29:22 -07:00
Mengwei Liu	10fc58620e	[PyTorch][NASProfiler] Add moduleHierarchy Python API to print out hierarchical information about a Node (#60384 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60384 Currently, inlining a module graph drops the module hierarchy info on the Python side. Here we retrieve the module hierarchy from the C++ side and expose it through a new Python API on Node called `moduleHierarchy()`. Test Plan: Usage:
```
torch._C._jit_pass_inline(module.graph)
torch._C._jit_pass_propagate_shapes_on_graph(module.graph)
node = module.graph.findNode("quantized::conv2d_relu")
'top(' + module.original_name + ').' + node.moduleHierarchy() + '.' + node.kind()
```
Output:
```
'top(QuantWrapper).module(FBNetHR).0(Sequential).xif0_0(ConvBNRelu).conv(ConvReLU2d).quantized::conv2d_relu'
```
Reviewed By: kimishpatel Differential Revision: D29252169 fbshipit-source-id: 74163a87f919e061e5e75dfebc4c5cdbe8489d93	2021-06-30 01:32:31 -07:00
Elias Ellison	9fd2306036	Add handling of symbolic shapes (#55925 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55925 This sets up the initial handling of symbolic shapes. As in the test, it doesn't work perfectly yet because it needs a couple other optimization passes. The basic description is pretty simple: we resolve tensor dimension indices to the same Value *, and before extracting out the output Tensor shape we substitute in symbolic shapes. We don't substitute during optimization because they are represented as negative numbers so we don't want them inadvertently used in Constant prop or something else. Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D27750996 Pulled By: eellison fbshipit-source-id: 6984e7276b578f96b00fc2025cef0e13f594b6e6	2021-05-21 08:50:52 -07:00
Elias Ellison	f39471a171	Initial Symbolic Shape Analysis (#54809 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54809 I'm going to post on dev-discuss soon with a more thorough explanation of the design and advantages of this shape analysis, so I'm leaving that out for now. There is still a ton left to do; I'm posting this initial version so we can get something on master that multiple people can work on. List of many remaining steps to do:
- [ ] Add symbolic shapes support
- [ ] Bind shape functions for operators in C++
- [ ] Make classes of operators share the same shape function (e.g. pointwise, broadcast two inputs)
- [ ] Refactor APIs
- [ ] Only iteratively optimize shape function while a change has been made
- [ ] Expand coverage to common ops
- [ ] Add shape analysis pass on Graph that handles Ifs and Loops
- [ ] Allow concurrent reads to the operator map
- [ ] Successive applications of same inputs to same shape function (e.g. series of pointwise ops)
For this review, I am mostly looking for comments related to the implementation of symbolic_shape_analysis.cpp, with the caveats listed above. I am not really looking for comments related to api/registration/graph level analysis as those are all planned to be changed. I am fine landing this as is or waiting until necessary components of the TODOs above are finished. Test Plan: Imported from OSS Reviewed By: pbelevich Differential Revision: D27750998 Pulled By: eellison fbshipit-source-id: 4338b99e8651df076291c6b781c0e36a1bcbec03	2021-05-21 08:49:46 -07:00
BowenBao	346dc88bfa	[ONNX] Support registering custom export for prim::PythonOp from torch.autograd.Function (#55630 ) (#57600 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57600 Demo script:
```python
import torch

class MyReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input, scalar_tuple, scalar, scalar_list):
        ctx.save_for_backward(input)
        return input.clamp(min=scalar)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_a = torch.nn.Linear(2, 2)
        self.linear_b = torch.nn.Linear(2, 2)
        self.relu = MyReLU.apply

    def forward(self, x):
        h = self.linear_a(x)
        h = self.relu(h, (5, 3), 2, [1, 2, 3])
        h = self.linear_b(h)
        return h

"""
User define how to export prim::PythonOp into custom op.
"""
def symbolic_pythonop(g, n, *args, **kwargs):
    # Print information:
    print('arguments of ', kwargs['name'], ':')
    print('original node: ', n)
    for i, out in enumerate(n.outputs()):
        print('original output {}: {}, requires grad: {}'.format(i, out, out.requiresGrad()))
    import torch.onnx.symbolic_helper as sym_helper
    for i, arg in enumerate(args):
        print('arg {}: {}, requires grad: {}'.format(i, arg, arg.requiresGrad() if sym_helper._is_value(arg) else False))
    for k, v in kwargs.items():
        print('key: ', k, ' v: ', v)

    # TODO: all inputs (tensors and scalars) are in args.
    #       backend can define CustomDomain::PythonOp and how info are stored however it deem fit.
    return g.op("CustomDomain::PythonOp", args[0], name_s=kwargs['name'])

torch.onnx.register_custom_op_symbolic("::prim_PythonOp", symbolic_pythonop, 9)

# Define input.
x = torch.tensor([[0.3971, 0.7544], [0.5695, 0.4388]], requires_grad=True)

model = MyModule()
# Forward.
y = model(x)

torch.onnx.export(model, (x,), 'model.onnx', opset_version=12, verbose=True)
```
Test Plan: Imported from OSS Reviewed By: malfet Differential Revision: D28393528 Pulled By: SplitInfinity fbshipit-source-id: e0d55b7c737c5916fda08a3b26b3306037f970df Co-authored-by: BowenBao <bowbao@microsoft.com>	2021-05-13 13:42:49 -07:00


98 Commits