pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
BowenBao	3f9c803fe8	[ONNX] Redesign onnx pass to enable shape type dependent pattern conversion - cont (#51795 ) (#53304 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53304 With the introduction of ONNX shape inference, shape and type are inferred on the fly as operators get converted from ATen to ONNX when running symbolic functions. This resolves the shape/type requirement for the symbolic functions. The pre-onnx passes, however, cannot be supported by shape inference, since at that stage the operators in the graph are still ATen operators. This PR updates the design of the ONNX pass to enable a mechanism for capturing subgraphs of ATen operators that match certain patterns and converting them later, when shape/type information of upstream operators is available. The new design will require pre-onnx passes that need shape/type to be written in two parts, encapsulation and conversion. The encapsulation part will find the nodes matching a pattern, as pre-onnx passes did previously. But instead of converting the nodes, it will encapsulate them into a sub-block of a new placeholder node. This part is called before the onnx pass, so it runs before calling symbolic functions. The conversion part will be called inside the onnx pass. In the onnx pass, run_symbolic_func will be called for each node in topological order. When it reaches the placeholder node, the conversion part will be invoked. It will convert the nodes inside the sub-block based on pattern. By that time, it will have shape/type of upstream operators available. After the conversion is complete, the placeholder node will be removed, and nodes inside its sub-block converted. run_symbolic_func will then be called for these nodes, and they will be converted from ATen operators to ONNX operators. This PR includes several other fixes, listed below. 
* ~~replace helper.cpp with onnx_utils.cpp for holding utility functions.~~ * fix EraseNumberTypes on Bool type; the code was outdated, written before the Bool type existed. * ~~enable onnx shape inference in export with parameter/initializer data.~~ * other code cleanups. * fix insertion of identity nodes for loop opset 13 sequence output. ~~PR depends on #51603~~ Test Plan: Imported from OSS Reviewed By: SplitInfinity Differential Revision: D26922417 Pulled By: malfet fbshipit-source-id: 14ed06158d539e2451c2e5e63ba1b32fb0f75095	2021-03-11 10:30:09 -08:00
Michael Suo	b4d8f4af82	[package] implement `get_resource_reader` API (#51674 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51674 See https://docs.python.org/3/library/importlib.html#importlib.abc.ResourceReader Test Plan: Imported from OSS Reviewed By: zdevito Differential Revision: D26237034 Pulled By: suo fbshipit-source-id: 4c19f6172d16b710737528d3de48372873b9368d	2021-03-10 12:11:11 -08:00
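The resource-reader protocol that `get_resource_reader` hooks into can be sketched in plain Python. This is a minimal illustration, not the `torch.package` implementation: `InMemoryLoader` and `InMemoryResourceReader` are hypothetical names, the reader is duck-typed rather than derived from the `importlib` ABC, and resources are assumed to live in a dict instead of a package archive.

```python
import io


class InMemoryResourceReader:
    """Hypothetical reader exposing the four ResourceReader methods."""

    def __init__(self, files):
        self._files = files  # maps resource name -> bytes

    def open_resource(self, resource):
        # Return a binary file-like object, or raise FileNotFoundError.
        try:
            return io.BytesIO(self._files[resource])
        except KeyError:
            raise FileNotFoundError(resource) from None

    def resource_path(self, resource):
        # Nothing here exists on the file system, so the protocol
        # asks us to raise FileNotFoundError.
        raise FileNotFoundError(resource)

    def is_resource(self, name):
        return name in self._files

    def contents(self):
        return iter(self._files)


class InMemoryLoader:
    """Hypothetical loader; only the resource hook is sketched."""

    def __init__(self, packages):
        self._packages = packages  # package name -> {resource name: bytes}

    def get_resource_reader(self, fullname):
        # Return None when the package is unknown or has no resources.
        files = self._packages.get(fullname)
        return None if files is None else InMemoryResourceReader(files)


loader = InMemoryLoader({"my_pkg": {"vocab.txt": b"hello\nworld\n"}})
reader = loader.get_resource_reader("my_pkg")
with reader.open_resource("vocab.txt") as f:
    first_line = f.readline()
```

Raising `FileNotFoundError` from `resource_path` is what tells callers to fall back to `open_resource` for data that has no real path on disk.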
Raghavan Raman	d3cde6c23c	[NNC] Implementation for aten::cat without conditionals. (#53128 ) Summary: This PR adds an implementation for `aten::cat` in NNC without any conditionals. This version is not enabled by default. Here is the performance of some micro benchmarks with and without conditionals. There is up to 50% improvement in performance without conditionals for some of the shapes. aten::cat implementation in NNC with conditionals ``` $ python -m benchmarks.tensorexpr --device cpu --mode fwd --jit_mode trace --cpu_fusion concat pt: concat2d2input_fwd_cpu_1_160_1_14_1: 5.44 us, SOL 0.26 GB/s, algorithmic 0.51 GB/s pt: concat2d2input_fwd_cpu_1_580_1_174_1: 5.75 us, SOL 1.05 GB/s, algorithmic 2.10 GB/s pt: concat2d2input_fwd_cpu_20_160_20_14_1: 6.87 us, SOL 4.05 GB/s, algorithmic 8.11 GB/s pt: concat2d2input_fwd_cpu_20_580_20_174_1: 14.52 us, SOL 8.31 GB/s, algorithmic 16.62 GB/s pt: concat2d2input_fwd_cpu_8_512_8_512_1: 9.58 us, SOL 6.84 GB/s, algorithmic 13.68 GB/s ``` aten::cat implementation in NNC without conditionals ``` $ python -m benchmarks.tensorexpr --device cpu --mode fwd --jit_mode trace --cpu_fusion --cat_wo_conditionals concat pt: concat2d2input_fwd_cpu_1_160_1_14_1: 4.67 us, SOL 0.30 GB/s, algorithmic 0.60 GB/s pt: concat2d2input_fwd_cpu_1_580_1_174_1: 5.65 us, SOL 1.07 GB/s, algorithmic 2.14 GB/s pt: concat2d2input_fwd_cpu_20_160_20_14_1: 6.10 us, SOL 4.56 GB/s, algorithmic 9.12 GB/s pt: concat2d2input_fwd_cpu_20_580_20_174_1: 7.44 us, SOL 16.22 GB/s, algorithmic 32.44 GB/s pt: concat2d2input_fwd_cpu_8_512_8_512_1: 6.46 us, SOL 10.14 GB/s, algorithmic 20.29 GB/s ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/53128 Reviewed By: bertmaher Differential Revision: D26758613 Pulled By: navahgar fbshipit-source-id: 00f56b7da630b42bc6e7ddd4444bae0cf3a5780a	2021-03-07 22:57:02 -08:00
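The "up to 50%" figure can be checked directly from the timings quoted in the commit message. The short script below is only arithmetic on those published numbers; the dictionary keys are the shape suffixes of the benchmark names, and `reduction_pct` is a helper name introduced here for illustration.

```python
# Timings in microseconds, copied from the two benchmark runs above.
with_conditionals = {
    "1_160_1_14_1": 5.44,
    "1_580_1_174_1": 5.75,
    "20_160_20_14_1": 6.87,
    "20_580_20_174_1": 14.52,
    "8_512_8_512_1": 9.58,
}
without_conditionals = {
    "1_160_1_14_1": 4.67,
    "1_580_1_174_1": 5.65,
    "20_160_20_14_1": 6.10,
    "20_580_20_174_1": 7.44,
    "8_512_8_512_1": 6.46,
}


def reduction_pct(before, after):
    """Percentage of run time saved when going from `before` to `after`."""
    return 100.0 * (before - after) / before


reductions = {
    shape: reduction_pct(with_conditionals[shape], without_conditionals[shape])
    for shape in with_conditionals
}

for shape, pct in reductions.items():
    print(f"concat2d2input_fwd_cpu_{shape}: {pct:.1f}% less time")

best_shape = max(reductions, key=reductions.get)
```

The largest saving is on the `20_580_20_174_1` shape, where run time drops from 14.52 us to 7.44 us, a reduction of roughly 49%.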
Bram Wasti	56f8379802	[static runtime] Move all heavy constructor logic into InferenceModule (renamed to StaticModule) (#51564 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51564 Constructor logic was spread throughout InferenceModule and StaticRuntime. This diff unifies the two. After a lot of discussion on this diff D25961626 it became apparent that `clone` is uglier than a cheap StaticRuntime. This means StaticRuntime is effectively StaticModule and the only code in the new StaticRuntime is the `run` functions. ``` graph, schema = PrepareForStaticModule(torchscript_module) sm = StaticModule(graph, schema, options) sm(inputs) // or create many cheap runtimes with the module sr = StaticRuntime(sm) sr(inputs) ``` Changelist: - Rename InferenceModule to StaticModule - Move all logic for construction into StaticModule - Create a new StaticRuntime that only has a unique memory planner (everything else is in StaticModule) - Update comments with explanation - Propagate all changes to predictor integration - Propagate all changes to python integration - Change semantics to be a bit more PyTorch-standard (no "run" calls, no "get_" getters). Test Plan: buck test //caffe2/test:static_runtime buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest Reviewed By: hlu1 Differential Revision: D25592967 fbshipit-source-id: 8233bed03137ce129137af2d44bce0095033ef0f	2021-03-05 10:15:26 -08:00
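The split described above (one heavy, shared module and many cheap runtimes) can be illustrated with a toy in pure Python. This is a sketch of the ownership pattern only, under the assumption that a list of callables stands in for the prepared graph; `ToyStaticModule` and `ToyStaticRuntime` are hypothetical names and do not reflect the real C++ classes.

```python
class ToyStaticModule:
    """Does the expensive one-time preparation; shared by all runtimes."""

    def __init__(self, ops):
        # Stand-in for the prepared graph and schema.
        self.ops = tuple(ops)

    def __call__(self, x):
        # Convenience path: build a throwaway runtime and run it.
        return ToyStaticRuntime(self)(x)


class ToyStaticRuntime:
    """Cheap to construct: references the module, owns only scratch state."""

    def __init__(self, module):
        self.module = module
        self.scratch = []  # stand-in for a per-runtime memory planner

    def __call__(self, x):
        self.scratch.clear()
        for op in self.module.ops:
            x = op(x)
            self.scratch.append(x)  # intermediates live in this runtime only
        return x


sm = ToyStaticModule([lambda v: v + 1, lambda v: v * 2])
runtimes = [ToyStaticRuntime(sm) for _ in range(3)]
results = [rt(i) for i, rt in enumerate(runtimes)]
direct = sm(5)
```

Because each runtime holds only a reference to the module plus its own scratch list, creating one per thread or per request costs little compared with rebuilding the module.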
Elias Ellison	bfae3789ba	Move conv to mkldnn (#51483 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51483 This PR moves the conv weights of a frozen model to MKLDNN, and AOT reorders the weights. When the weights are already in MKLDNN, just computing a single conv by converting the input and output from/to mkldnn provides large speedups. I benchmark'd the results of the top 200 shapes in predictor [here](https://www.internalfb.com/phabricator/paste/view/P171537938), as well as verified that it sped up popular models in torchvision. Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D26696703 Pulled By: eellison fbshipit-source-id: 0b4441bee4f6e0890a4540fbca3bb5e58b8c5adf	2021-03-01 21:19:27 -08:00
jiej	4d94ee566e	Ge v1 (#52136 ) Summary: This is a second attempt to use graph executor to run forward on a gradient. This allows a secondary chance to profile intermediate tensor introduced by autodiff. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52136 Reviewed By: pbelevich Differential Revision: D26693978 Pulled By: Krovatkin fbshipit-source-id: 91dde8009a210950af8e5173668ada241e16dd52	2021-02-28 00:53:13 -08:00
Meghan Lele	1d6bd15790	[JIT] Add torch._C._jit submodule (#52910 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52910 Summary PR #52158 tried to move all JIT bindings from `torch._C` to a new submodule `torch._C._jit`, but that...did not go well. This pull request adds the new `torch._C._jit` submodule, but does not migrate the existing bindings. Instead, it adds a unit test that fails if any new bindings are added to `torch._C`. A comment in the test instructs developers to add their new binding to the allowlist if it really should be in `torch._C`, or to add it to the appropriate submodule (e.g. `torch._C._jit`). The idea is to prevent the issue described in #51691 from getting worse if it cannot be fixed. Test Plan Continuous integration. Fixes This commit fixes #51691. Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D26698373 Pulled By: SplitInfinity fbshipit-source-id: ec9f5426051227a513d4fd09512b624420e0100b	2021-02-26 16:05:05 -08:00
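An allowlist check of the kind described above might look roughly like the following. This is a sketch on a stand-in module, not the test added by the commit: `toy_core`, `ALLOWLIST`, and `unexpected_names` are hypothetical names chosen for illustration.

```python
import types

# Names that are permitted to live directly on the module.
ALLOWLIST = {"existing_binding", "another_binding"}

# Stand-in for a C extension module such as torch._C.
toy_core = types.ModuleType("toy_core")
toy_core.existing_binding = object()
toy_core.another_binding = object()


def unexpected_names(module, allowlist):
    """Return non-dunder attributes of `module` that are not allowlisted."""
    public = {name for name in vars(module) if not name.startswith("__")}
    return sorted(public - allowlist)


before = unexpected_names(toy_core, ALLOWLIST)

# Simulate a developer adding a new top-level binding.
toy_core.new_binding = object()
after = unexpected_names(toy_core, ALLOWLIST)
```

A unit test would assert that the returned list is empty, so adding `new_binding` forces a deliberate choice between extending the allowlist and moving the binding into a submodule.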
Lillian Johnson	b72a72a477	torch.Package extend PyTorchStreamWriter to track written records (#52218 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52218 Test Plan: Imported from OSS Reviewed By: suo Differential Revision: D26429794 Pulled By: Lilyjjo fbshipit-source-id: 5f68e7991c673ada629d0370c705520243d0637a	2021-02-22 15:02:41 -08:00
Zachary DeVito	60518d10f6	[deploy] torch::deploy API (#51754 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51754 This API allows you to manage multiple python interpreters in a single process to deploy PyTorch models packaged with torch.package. torch/csrc/deploy/deploy.h contains the API definition torch/csrc/deploy/test_deploy.cpp has some examples. Notes: * mutex is added to PyTorchStreamReader to make it safe to use from multiple threads at once. * USE_DEPLOY is only true for the special libtorch_deployinterpreter.so library, when enabled we use a hash table to maintain PyObject <> at::Tensor mapping rather than the internal pointer in Tensor since >1 interpreter may have a reference to the tensor. * serialization.py has some additional functions for creating pickle objects but keeping storages in memory for use transferring tensors between interpreters Test Plan: Imported from OSS Reviewed By: wconstab Differential Revision: D26329468 Pulled By: zdevito fbshipit-source-id: d75f4ebb9a27f1d911179d9996041bcb3ca04a07	2021-02-18 02:30:08 -08:00
Michael Suo	c357f8b826	[package] make torch.package produce unified format (#51826 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51826 Looks like this: ``` resnet.pt ├── .data # Data folder named so it can't clash with torch.package code modules. │ │ # Names/extensions automatically added to avoid naming conflicts. │ ├── 94286146172688.storage # tensor data │ ├── 94286146172784.storage │ ├── extern_modules # torch.package metadata │ ├── version # version metadata │ └── ... ├── model # package pickled model created w/ │ │ # exporter.save_pickle('model','model.pkl', resnet_model) │ └── model.pkl └── torchvision # all code dependencies for packaged pickled └── models # models are captured as source files ├── resnet.py └── utils.py ``` Since `version` is hardcoded in our zip reader/writer implementation, add it as an option that defaults to "version" but accepts other locations for putting the version metadata. Test Plan: Imported from OSS Reviewed By: zdevito Differential Revision: D26295649 Pulled By: suo fbshipit-source-id: 2d75feeb7de0f78196b4d0b6e2b814a7d58bd1dd	2021-02-09 07:45:59 -08:00
Rohan Varma	c941730b96	[JIT/Futures] support set_exception api (#50983 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50983 There is currently no way to handle/propagate errors with the python-based futures API (they are raised correctly if set with an error, but this is only possible from C++). This diff allows the Future's `unwrap_func` to be set in python optionally, so users can set futures completed with an exception and the error will throw as expected. This is mostly to support the following use case in the next diff: ``` ret_fut = torch.futures.Future(unwrap_func = lambda python_result: { # throw exception if needed if isinstance(python_result, Exception): throw python_result }) rpc_fut = rpc.rpc_async(...) # RPC future that times out # Goal is to propagate RPC error to this future rpc_fut.add_done_callback( res => { # Note that ret_fut.set_result(res.wait()) won't propagate the error try: ret_fut.set_result(res.wait()) except Exception as e: ret_fut.set_result(e) } ) ``` ghstack-source-id: 121021434 Test Plan: unittest ``` buck test mode/dev-nosan mode/no-gpu //caffe2/test:futures -- test_unwrap --print-passing-details ``` Reviewed By: mrshenli Differential Revision: D25950304 fbshipit-source-id: 7ee61e98fcd783b3f515706fa141d538e6d2174d	2021-02-04 20:22:19 -08:00
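The error-propagation goal in the pseudocode above can be demonstrated with the standard library. The sketch below deliberately uses `concurrent.futures.Future` as a stand-in, not `torch.futures.Future`, so it says nothing about the PyTorch signatures; `propagate` is a hypothetical helper name.

```python
from concurrent.futures import Future


def propagate(src, dst):
    """Copy the outcome of `src` (result or exception) onto `dst`."""

    def on_done(done):
        exc = done.exception()
        if exc is not None:
            # Mark dst as failed so that waiting on it raises the error.
            dst.set_exception(exc)
        else:
            dst.set_result(done.result())

    src.add_done_callback(on_done)


rpc_fut = Future()  # stand-in for the RPC future that times out
ret_fut = Future()  # the future handed back to the caller
propagate(rpc_fut, ret_fut)

# Simulate the RPC failing.
rpc_fut.set_exception(TimeoutError("RPC timed out"))

try:
    ret_fut.result()
    outcome = "no error"
except TimeoutError as err:
    outcome = f"propagated: {err}"
```

Setting the exception on the destination, rather than storing the exception object as an ordinary result, is what makes the caller's wait raise instead of silently returning an `Exception` instance.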
Meghan Lele	88baf470d1	[JIT] Provide more info when attribute fails to convert (#50870 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50870 Summary Module attributes whose types cannot be determined based on annotations or inference based on their values at script time are added to the concrete type of the corresponding module as "failed attributes". Any attempt to access them in scripted code produces an error with a message explaining that the attribute could not be converted to a corresponding attribute on the TorchScript module. However, this error is not more specific than that. This commit modifies `infer_type` in `_recursive.py` so that it returns `c10::InferredType` instead, which allows more information about typing failures to be communicated to the caller through the `reason()` method on this class. This information is appended to the hint added to the module concrete type for failed attributes. Testing This commit adds a unit test to `test_module_containers.py` that checks that extra information is provided about the reason for the failure when a module attribute consisting of a list of `torch.nn.Module` fails to convert. Test Plan: Imported from OSS Reviewed By: pbelevich Differential Revision: D26091472 Pulled By: SplitInfinity fbshipit-source-id: fcad6588b937520f250587f3d9e005662eb9af0d	2021-01-27 20:37:10 -08:00
BowenBao	1c9347c666	[ONNX] Use parameter values in onnx shape inference (#49706 ) (#50905 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50905 Adds an additional run of onnx shape inference after constant folding, since initializer may have changed and affected shape inference. Test Plan: Imported from OSS Reviewed By: pbelevich Differential Revision: D26050881 Pulled By: SplitInfinity fbshipit-source-id: 9e5d69c52b647133cd3a0781988e2ad1d1a9c09d	2021-01-27 17:45:32 -08:00
neginraoof	137f2a385a	[ONNX] Handle sequence output for models (#50599 ) Summary: Duplicate of https://github.com/pytorch/pytorch/issues/46542 Pull Request resolved: https://github.com/pytorch/pytorch/pull/50599 Reviewed By: SplitInfinity Differential Revision: D25928897 Pulled By: bzinodev fbshipit-source-id: a898cef7b2d15a287aedd9798ce1423cebf378d4	2021-01-21 15:36:41 -08:00
Brian Vaughan	a9db2f8e7a	Revert D24924236: [pytorch][PR] [ONNX] Handle sequence output shape and type inference Test Plan: revert-hammer Differential Revision: D24924236 (`adc65e7c8d`) Original commit changeset: 506e70a38cfe fbshipit-source-id: 78069a33fb3df825af1cb482da06a07f7b26ab48	2021-01-15 05:58:35 -08:00
Negin Raoof	adc65e7c8d	[ONNX] Handle sequence output shape and type inference (#46542 ) Summary: Handle sequence output shape and type inference. This PR fixes value type of sequence outputs. Prior to this, all model sequence type outputs were unfolded for ONNX models. This PR also enable shape inference for sequence outputs to represent the dynamic shape of these values. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46542 Reviewed By: ezyang Differential Revision: D24924236 Pulled By: bzinodev fbshipit-source-id: 506e70a38cfe31069191d7f40fc6375239c6aafe	2021-01-14 21:12:35 -08:00
Mikhail Zolotukhin	e9dc8fc162	[TensorExpr] Add python bindings. (#49698 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49698 Reincarnation of #47620 by jamesr66a. It's just an initial bunch of things that we're exposing to python, more is expected to come in future. Some things can probably be done better, but I'm putting this out anyway, since some other people were interested in using and/or developing this. Differential Revision: D25668694 Test Plan: Imported from OSS Reviewed By: bertmaher Pulled By: ZolotukhinM fbshipit-source-id: fb0fd1b31e851ef9ab724686b9ac2d172fa4905a	2021-01-14 21:02:47 -08:00
Spandan Tiwari	aeefe2ce31	[ONNX] ONNX dev branch merge 01-06-2021 (#50163 ) Summary: [ONNX] ONNX dev branch merge 01-06-2021 - [ONNX] Support onnx if/loop sequence output in opset 13 - (https://github.com/pytorch/pytorch/issues/49270) - Symbolic function for torch.square (https://github.com/pytorch/pytorch/issues/49446) - [ONNX] Add checks in ONNXSetDynamicInputShape (https://github.com/pytorch/pytorch/issues/49783) … - [ONNX] Enable export of aten::__derive_index (https://github.com/pytorch/pytorch/issues/49514) … - [ONNX] Update symbolic for unfold (https://github.com/pytorch/pytorch/issues/49378) … - [ONNX] Update the sequence of initializers in exported graph so that it is the same as the inputs. (https://github.com/pytorch/pytorch/issues/49798) - [ONNX] Enable opset 13 ops (https://github.com/pytorch/pytorch/issues/49612) … - [ONNX] Improve error message for supported model input types in ONNX export API. (https://github.com/pytorch/pytorch/issues/50119) - [ONNX] Add a post-pass for If folding (https://github.com/pytorch/pytorch/issues/49410) Pull Request resolved: https://github.com/pytorch/pytorch/pull/50163 Reviewed By: pbelevich Differential Revision: D25821059 Pulled By: SplitInfinity fbshipit-source-id: 9f511a93d9d5812d0ab0a49d61ed0fa5f8066948	2021-01-13 13:51:21 -08:00
Elias Ellison	a389b30bfc	Add Post Freezing Optimizations, turn on by default in torch.jit.freeze (#50222 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50222 This PR adds a pass which runs a set of optimizations to be done after freezing. Currently this encompasses Conv-BN folding, Conv->Add/Sub/Mul/Div folding and I'm also planning on adding dropout removal. I would like some feedback on the API. torch.jit.freeze is technically in ~prototype~ phase so we have some leeway around making changes. I think in the majority of cases, the user is going to want to freeze their model, and then run inference. I would prefer if the optimization was opt-out instead of opt-in. All internal/framework use cases of freezing use `freeze_module`, not the python API, so this shouldn't break anything. I have separated out the optimization pass as a separate API to make things potentially modular, even though I suspect that is an unlikely case. In a future PR I would like to add a `torch::jit::freeze` which follows the same api as `torch.jit.freeze` intended for C++ use, and runs the optimizations. Test Plan: Imported from OSS Reviewed By: tugsbayasgalan Differential Revision: D25856264 Pulled By: eellison fbshipit-source-id: 56be1f12cfc459b4c4421d4dfdedff8b9ac77112	2021-01-12 11:39:13 -08:00
Elias Ellison	6971149326	[JIT] Add Frozen Conv-> Add/Sub/Mul/Div fusion (#50075 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50075 Adds Conv - Add/Sub/Mul/Div fusion for frozen models. This helps cover models like torchvision maskrcnn, which use a hand-rolled batchnorm implementation: `90645ccd0e/torchvision/ops/misc.py (L45)`. I haven't tested results yet but I would expect a somewhat similar speed up as conv-bn fusion (maybe a little less). Test Plan: Imported from OSS Reviewed By: tugsbayasgalan Differential Revision: D25856265 Pulled By: eellison fbshipit-source-id: 2c36fb831a841936fe4446ed440185f59110bf68	2021-01-12 11:39:02 -08:00
Elias Ellison	035229c945	[JIT] Frozen Graph Conv-BN fusion (#50074 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50074 Adds Conv-BN fusion for models that have been frozen. I haven't explicitly tested perf yet but it should be equivalent to the results from Chillee's PR [here](https://github.com/pytorch/pytorch/pull/476570) and [here](https://github.com/pytorch/pytorch/pull/47657#issuecomment-725752765). Click on the PR for details but it's a good speed up. In a later PR in the stack I plan on making this optimization on by default as part of `torch.jit.freeze`. I will also in a later PR add a peephole so that conv->batchnorm2d doesn't generate a conditional checking # dims. Zino was working on freezing and left the team, so not really sure who should be reviewing this, but I don't care too much so long as I get a review. Test Plan: Imported from OSS Reviewed By: tugsbayasgalan Differential Revision: D25856261 Pulled By: eellison fbshipit-source-id: da58c4ad97506a09a5c3a15e41aa92bdd7e9a197	2021-01-12 11:37:32 -08:00
Chen Lai	717f31d984	Remove unused reconstruct_scopes function (#48822 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48822 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D25325012 Pulled By: cccclai fbshipit-source-id: 86ea4c0b2926257c0f82aa05cbcd83278b1b67f7	2020-12-11 23:43:36 -08:00
neginraoof	15bc21c280	[ONNX] Track and list model params for scripting (#47348 ) Summary: List model parameters as inputs following freezing script module. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47348 Reviewed By: heitorschueroff Differential Revision: D25309756 Pulled By: bzinodev fbshipit-source-id: cbe679ece934d5e6c418a22f08c1662256914c4c	2020-12-03 23:07:28 -08:00
Meghan Lele	18eccfbe42	[JIT] Fix clang-tidy warnings in jit/python (#47985 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47985 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D25258644 Pulled By: SplitInfinity fbshipit-source-id: dfc15dc62c148f79f4e99fd058a6bf2d071ccbb5	2020-12-02 12:35:36 -08:00
Bram Wasti	43a9d6fb6e	[TorchScript] Support user defined classes as constants (#5062 ) Summary: Pull Request resolved: https://github.com/pytorch/glow/pull/5062 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45556 User defined classes can be used as constants. This is useful when freezing and removing the module from the graph. Test Plan: waitforsadcastle Reviewed By: eellison Differential Revision: D23994974 fbshipit-source-id: 5b4a5c91158aa7f22df39d71f2658afce1d29317	2020-11-16 20:52:02 -08:00
Zino Benaissa	11710598db	Preserve module parameters in freezing (#47094 ) Summary: Added preserveParameters to the freezing API, which allows preserving module parameters. Fixes #{39613} Pull Request resolved: https://github.com/pytorch/pytorch/pull/47094 Reviewed By: eellison Differential Revision: D24792867 Pulled By: bzinodev fbshipit-source-id: f0cd980f5aed617b778afe2f231067c7c30a1527	2020-11-13 20:18:32 -08:00
generatedunixname89002005325676	8855c4e12f	[AutoAccept][Codemod][FBSourceClangFormatLinter] Daily `arc lint --take CLANGFORMAT` Differential Revision: D24946660 fbshipit-source-id: e47d04cac21314acb7f9ac3bdfa0d09289e399b4	2020-11-13 06:59:04 -08:00
Elias Ellison	fe81faee5f	Add more CPU tests (#47369 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47369 Test Plan: Imported from OSS Reviewed By: ansley Differential Revision: D24805251 Pulled By: eellison fbshipit-source-id: f1a8210ffdc3cc88354cb4896652151d83a0345a	2020-11-12 11:13:47 -08:00
Elias Ellison	f221a19a7f	Force LLVM Compilation for CPU Tests (#46949 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46949 Test Plan: Imported from OSS Reviewed By: ansley Differential Revision: D24805247 Pulled By: eellison fbshipit-source-id: 4fcaf02d8a78cc5cbcbde36940d0a2c85fba3fc5	2020-11-12 11:12:08 -08:00
jiej	ac146c4820	[nvFuser] Switching to `CudaFusionGuard` from `BailOut` for nvfuser - update 2 (#46452 ) Summary: 1. Added CudaFusionGuard as the custom TypeCheck for nvfuser; enabled dynamic shape support with profiling executor; 2. dropped support for legacy fuser; 3. re-enabled nvfuser tests; 4. added registration for profiling record to allow profiling on user specified nodes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46452 Reviewed By: zou3519, anjali411 Differential Revision: D24364642 Pulled By: ngimel fbshipit-source-id: daf53a9a6b6636e1ede420a3a6d0397d4a8b450b	2020-10-19 15:44:31 -07:00
Tao Xu	495070b388	[Metal] Add the Python binding for optimize_for_mobile (#46456 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46456 Add the python binding in CMake. The general workflow is - Build pytorch - `USE_PYTORCH_METAL=ON python setup.py install --cmake` - Run optimize_for_mobile ``` import torch from torch.utils.mobile_optimizer import optimize_for_mobile scripted_model = torch.jit.load('./mobilenetv2.pt') optimized_model = optimize_for_mobile(scripted_model, backend='metal') torch.jit.export_opnames(optimized_model) torch.jit.save(optimized_model, './mobilenetv2_metal.bc') ``` The exported ops are ``` ['aten::adaptive_avg_pool2d', 'aten::add.Tensor', 'aten::addmm', 'aten::reshape', 'aten::size.int', 'metal::copy_to_host', 'metal_prepack::conv2d_run'] ``` ghstack-source-id: 114559878 Test Plan: - Sandcastle CI - Circle CI Reviewed By: kimishpatel Differential Revision: D24356768 fbshipit-source-id: fb5c4c4b6316347b67edb4132da044a81470ddfd	2020-10-17 10:26:25 -07:00
Brian Hirsh	a3caa719af	fix #45552 - adding add_done_callback(fn) to torch.futures.Future (#45675 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45675 Test Plan: Imported from OSS Reviewed By: glaringlee Differential Revision: D24055353 Pulled By: bdhirsh fbshipit-source-id: 9233c8e17acc878f0fecbe740a4397fb55cf722f	2020-10-13 07:47:36 -07:00
Elias Ellison	1b97ffa07a	[1/3] [JIT] Make sure fusion occurs in test_tensorexpr file (#45788 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45788 We were only running the traced graph once, which would not yet have been fused at that point. We should run for num_profiled_runs + 1, and also assert that all nodes in the graph were fused. Test Plan: Imported from OSS Reviewed By: bertmaher Differential Revision: D24169537 Pulled By: eellison fbshipit-source-id: 8499bb1a5bd9d2221b1f1c54d6352558cf07ba9a	2020-10-08 12:02:57 -07:00
BowenBao	3da4cea658	[ONNX] Add dim_param support in export with onnx shape inference (#44920 ) Summary: * Support propagating `dim_param` in ONNX by encoding as `ShapeSymbol` in `SymbolicShape` of outputs. If export is called with `dynamic_axes` provided, shape inference will start with these axes set as dynamic. * Add new test file `test_pytorch_onnx_shape_inference.py`, reusing all test cases from `test_pytorch_onnx_onnxruntime.py`, but focusing on validating shapes for all nodes in the graph. Currently this is not enabled in the CI, since there are still quite a few existing issues and corner cases to fix. By default, the test runs only at opset 12. * Bug fixes, such as div, _len, and peephole.cpp passes for PackPadded, and LogSoftmaxCrossEntropy. * This PR depends on existing PRs such as 44332. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44920 Reviewed By: eellison Differential Revision: D23958398 Pulled By: bzinodev fbshipit-source-id: 00479d9bd19c867d526769a15ba97ec16d56e51d	2020-09-30 21:56:24 -07:00
Negin Raoof	6b42ca2d69	[ONNX] Update embedding_bag export (#44693 ) Summary: Export of embedding bag with dynamic list of offsets. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44693 Reviewed By: malfet Differential Revision: D23831980 Pulled By: bzinodev fbshipit-source-id: 3eaff1a0f20d1bcfb8039e518d78c491be381e1a	2020-09-30 13:36:40 -07:00
Jerry Zhang	f575df201f	[quant][graphmode][jit][api] Expose preserved_attrs from finalize to convert_jit (#44490 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44490 Test Plan: Imported from OSS Reviewed By: z-a-f Differential Revision: D23631142 fbshipit-source-id: f0913f0cb4576067e2a7288326024942d12e0ae0	2020-09-22 19:37:25 -07:00
Ivan Kobzarev	e9941a5dd4	[vulkan][py] torch.utils.optimize_for_vulkan (#44903 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44903 Test Plan: Imported from OSS Reviewed By: kimishpatel Differential Revision: D23766039 Pulled By: IvanKobzarev fbshipit-source-id: dbdf484ee7d3a7719aab105efba51b92ebc51568	2020-09-18 18:20:11 -07:00
Shawn Wu	572f7e069c	Enable type check for torch.testing._internal.te_utils.* (#44927 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44927 Test Plan: Imported from OSS Reviewed By: walterddr Differential Revision: D23776842 Pulled By: sshawnwu fbshipit-source-id: 65c028169a37e1f2f7d9fdce8a958234ee1caa26	2020-09-18 18:09:15 -07:00
Michael Suo	374e9373b5	[jit] Pull (most) tests out of libtorch_python (#44795 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44795 Today, we build our cpp tests twice, once as a standalone gtest binary, and once linked in `libtorch_python` so we can call them from `test_jit.py`. This is convenient (it means that `test_jit.py` is a single entry point for all our tests), but has a few drawbacks: 1. We can't actually use the gtest APIs, since we don't link gtest into `libtorch_python`. We're stuck with the subset that we want to write polyfills for, and an awkward registration scheme (where you have to write a test then include it in `tests.h`). 2. More seriously, we register custom operators and classes in these tests. In a world where we may be linking many `libtorch_python`s, this has a tendency to cause errors with `libtorch`. So now, only tests that explicitly require cooperation with Python are built into `libtorch_python`. The rest are built into `build/bin/test_jit`. There are tests which require that we define custom classes and operators. In these cases, I've built them into separate `.so`s that we call `torch.ops.load_library()` on. Test Plan: Imported from OSS Reviewed By: SplitInfinity, ZolotukhinM Differential Revision: D23735520 Pulled By: suo fbshipit-source-id: d146bf4e7eb908afa6f96b394e4d395d63ad72ff	2020-09-18 14:04:40 -07:00
Mikhail Zolotukhin	c6febc6480	[JIT] Add a python hook for a function to interpret JIT graphs. (#44493 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44493 This function allows executing a graph exactly as it is, without going through a graph executor which would run passes on the graph before interpreting it. I found this feature extremely helpful when I worked on a stress-testing script to shake out bugs from the TE fuser: I needed to run a very specific set of passes on a graph and nothing else, and then execute exactly that graph. Test Plan: Imported from OSS Reviewed By: jamesr66a Differential Revision: D23632505 Pulled By: ZolotukhinM fbshipit-source-id: ea81fc838933743e2057312d3156b77284d832ef	2020-09-11 02:55:26 -07:00
neginraoof	3d7c22a2ce	[ONNX] Enable new scripting passes for functionalization and remove_mutation (#43791 ) Summary: Duplicate of https://github.com/pytorch/pytorch/issues/41413 This PR initiates the process of updating the TorchScript backend interface used by the ONNX exporter. Replace the jit lower graph pass with the freeze module pass. Enable ScriptModule tests for ONNX operator tests (ORT backend) and model tests by default. Replace the jit remove_inplace_ops pass with remove_mutation and consolidate all passes for handling inplace ops. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43791 Reviewed By: houseroad Differential Revision: D23421872 Pulled By: bzinodev fbshipit-source-id: a98710c45ee905748ec58385e2a232de2486331b	2020-09-04 15:21:45 -07:00
Bert Maher	98ad5ff41f	[te] Disable reductions by default (#44122 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44122 Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D23504769 Pulled By: bertmaher fbshipit-source-id: 1889217cd22da529e46ab30c9319a5646267e4ec	2020-09-03 23:37:45 -07:00
Lu Fang	f15e27265f	[torch.fx] Add support for custom op (#43248 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43248 We add the support of __torch_function__ override for C++ custom op. The logic is the same as the other components, like torch.nn.Module. Refactored some code a little bit to make it reusable. Test Plan: buck test //caffe2/test:fx -- test_torch_custom_ops Reviewed By: bradleyhd Differential Revision: D23203204 fbshipit-source-id: c462a86e407e46c777171da32d7a40860acf061e	2020-09-02 16:08:37 -07:00
BowenBao	08126c9153	[ONNX] Utilize ONNX shape inference for ONNX exporter (#40628 ) Summary: The conversion from a torch operator to an onnx operator often requires the input rank/dtype/shape to be known. Previously, the conversion depended on the tracer to provide this information, leaving a gap in the conversion of scripted modules. We are extending the export with support from onnx shape inference. If enabled, onnx shape inference will be called whenever an onnx node is created. This is the first PR introducing the initial look of the feature. More and more cases will be supported following this PR. * Added pass to run onnx shape inference on a given node. The node has to have namespace `onnx`. * Moved helper functions from `export.cpp` to a common place for re-use. * This feature is currently experimental, and can be turned on through flag `onnx_shape_inference` in internal api `torch.onnx._export`. * Currently skipping ONNX Sequence ops, If/Loop and ConstantOfShape due to limitations. Support will be added in the future. Pull Request resolved: https://github.com/pytorch/pytorch/pull/40628 Reviewed By: mrshenli Differential Revision: D22709746 Pulled By: bzinodev fbshipit-source-id: b52aeeae00667e66e0b0c1144022f7af9a8b2948	2020-08-30 18:35:46 -07:00
Protonu Basu	58a7e73a95	[TensorExpr] Block Codegen (#40054 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40054 Reviewed By: ZolotukhinM Differential Revision: D22061350 Pulled By: protonu fbshipit-source-id: 004f7c316629b16610ecdbb97e43036c72c65067	2020-08-28 09:53:42 -07:00
Zino Benaissa	abe878ce96	Allow Freezing of Module containing interface attribute (#41860 ) Summary: This patch allows freezing a model that uses interfaces. Freezing works under the user's assumption that the interface module does not alias any value used in the model. To enable freezing of such modules, an extra parameter was added: torch._C._freeze_module(module, ignoreInterfaces = True) Pull Request resolved: https://github.com/pytorch/pytorch/pull/41860 Reviewed By: eellison Differential Revision: D22670566 Pulled By: bzinodev fbshipit-source-id: 41197a724bc2dca2e8495a0924c224dc569f62a4	2020-08-21 18:57:13 -07:00
taivu	665da61d2b	Replace Conv1d with Conv2d (#42867 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42867 Test Plan: Imported from OSS Reviewed By: kimishpatel Differential Revision: D23177916 Pulled By: kimishpatel fbshipit-source-id: 68cc40cf42d03e5b8432dc08f9933a4409c76e25	2020-08-20 21:36:51 -07:00
Sinan Nasir	6e1127ea3f	[NCCL] Changed FutureNCCL's then callback logic for better efficiency. (#42869 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42869 We realized that when we invoke a simple callback that divides the tensors by `world_size` after `allreduce`, the performance was almost 50% lower in terms of QPS compared to the case where a simple `allreduce` hook is used with no `then` callback. The main problem was that, because we call `work.wait()` before invoking the `then` callback, we were synchronizing `work`'s stream with the default PyTorch stream inside [`runHook`](https://github.com/pytorch/pytorch/blob/master/torch/csrc/distributed/c10d/reducer.cpp#L609) and stalling the backward computation. In this PR, we ensure that FutureNCCL's `then` callback does not stall the backward computation. Assuming single-process single-device, `FutureNCCL` gets a new stream from the device's pool using `at::cuda::getStreamFromPool` to run `callback`, and before invoking the `callback` inline it synchronizes `WorkNCCL`'s stream with the callback's stream, not the default stream. ghstack-source-id: 110208431 Test Plan: Run performance benchmark tests to validate performance issue is resolved. Also, `python test/distributed/test_c10d.py` to avoid any odd issues. Reviewed By: pritamdamania87 Differential Revision: D23055807 fbshipit-source-id: 60e50993f1ed97497514eac5cb1018579ed2a4c5	2020-08-19 19:42:22 -07:00
taivu	02c8ad70f2	Reconstruct scopes (#41615 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41615 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D22611331 Pulled By: taivu1998 fbshipit-source-id: d4ed4cf6360bc1f72ac9fa24bb4fcf6b7d9e7576	2020-08-13 22:38:16 -07:00
Bram Wasti	ada8404f2d	[jit] Scaffold a static runtime (#42753 ) Summary: The premise of this approach is that a small subset of neural networks are well represented by a data flow graph. The README contains more information. The name is subject to change, but I thought it was a cute reference to fire. suo let me know if you'd prefer this in a different spot. Since it lowers a JIT'd module directly I assumed the JIT folder would be appropriate. There is no exposed Python interface yet (but one is mocked up in `test_accelerant.py`) Pull Request resolved: https://github.com/pytorch/pytorch/pull/42753 Reviewed By: zou3519 Differential Revision: D23043771 Pulled By: bwasti fbshipit-source-id: 5353731e3aae31c08b5b49820815da98113eb551	2020-08-12 13:05:27 -07:00

148 Commits