pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Mikhail Zolotukhin	765bf8f03d	Remove duplicate bindings from torch/csrc/jit/python/init.cpp. (#36492 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36492 Test Plan: Imported from OSS Differential Revision: D20995235 Pulled By: ZolotukhinM fbshipit-source-id: 6afa3a956e57c2fb94bb29d332177be73a2bac2a	2020-04-13 12:28:32 -07:00
Kimish Patel	d559a47933	Enable relu fusion with prepacked linear/conv. (#35705 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35705 Introduces a pass for relu fusion. Test Plan: python test/test_xnnpack_integration.py Imported from OSS Differential Revision: D20746592 fbshipit-source-id: 6c22f60a20e9121618c85077b9b58fb8d4082b3b	2020-04-03 15:38:45 -07:00
Mikhail Zolotukhin	af5121f62a	Invoke TensorExpr fuser pass from a graph executor. (#35913 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35913 The pass itself is still disabled by default, but with this change we don't need to register it as a custom pass anymore. It allows us to control its behavior with env variables more easily. Test Plan: Imported from OSS Reviewed By: suo Differential Revision: D20827189 Pulled By: ZolotukhinM fbshipit-source-id: e74d90b5e46422e7ab7bc40974a805220da50fbc	2020-04-03 12:20:26 -07:00
Christian Sarofeen	6d24f8fe21	Infrastructure for a new CUDA Fuser (#34785 ) Summary: Summary: This PR contains the infrastructure of a new CUDA fuser. This CUDA fuser is based on many of the same principles of TensorExpressions and Halide, however the implementation is ground up. The fusion pass itself is similar to the default CUDA fuser, however, it has undergone some refactoring and is using the new code generation infrastructure. For those who are interested in how the code generation in this PR works, I would recommend reviewing _test/cpp/jit/test_gpu_fusion.cpp_ as well as the long comment section at the beginning of _torch/csrc/jit/codegen/cuda/transform_replay.h_ One of the largest differences between our approach and that of TVM/Halide, is the concept of "TensorView". TensorView from a high level should be thought of similarly to how we think of working with Tensors in PyTorch. It's an N-D object which can undergo transformations that change its dimensionality. Dimensionality changes are done through the operations split/merge/reorder/computeAt. These transformations are similar to split/fuse/reorder/compute_at of TVM, they modify how a tensor is iterated over to generate GPU code. Interestingly, in our scheme these transformations are applied to tensors and only impact how that tensor is generated. Warning: This PR is purposefully not feature complete with the current fuser. We wanted to separate out the infrastructure from the fusion capabilities. Once in, smaller incremental PRs will be submitted to expand capabilities of the fuser. Short term goals: Parity with current CUDA fuser (including performance): - Dynamic shapes (no recompilation) - Implicit handling of braodcast (broadcasted tensors are treated as tensors of the braodcasted size in the generated code) - Dropout Mid-term goals: - Transposes fused with pointwise operations where transpose involves only 2 axes (across the fused operation). - 1-D reductions fused with pointwise operations Pull Request resolved: https://github.com/pytorch/pytorch/pull/34785 Reviewed By: ZolotukhinM Differential Revision: D20650977 Pulled By: soumith fbshipit-source-id: ee39c95a880e1b9822e874ed4cc180971572bf63	2020-04-02 09:22:42 -07:00
Supriya Rao	a090de380c	[quant][graph] Add quant fusion for dynamic quantization (#35586 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35586 This pass fuses the choose_qparams-quant-dequant sequence Fusion for weight tensor is the same as static quant. Test Plan: python test/test_quantize_script.py Imported from OSS Differential Revision: D20755680 fbshipit-source-id: b7443770642b6e6fa0fa9da8a44637e9b2d4df70	2020-03-30 23:34:56 -07:00
Supriya Rao	1f7ee7b6b7	[quant][graph] Add pass to insert quant dequant for dynamic quantization (#35448 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35448 Add _choose_qparams_per_tensor which returns scale and zero_point similar to the dynamic quantization in the operator Test Plan: python test/test_quantize_script.py Imported from OSS Differential Revision: D20755679 fbshipit-source-id: c9066d8f1bb3e331809be26c4be806faafc9b981	2020-03-30 23:33:32 -07:00
Jerry Zhang	6fc2403951	[quant][graphmode] qconfig_dict support None (#35336 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35336 Test Plan: python test/test_quantization.py Imported from OSS Differential Revision: D20655302 fbshipit-source-id: b453f3240ac487aa29629953b4d71274dbbc25fc	2020-03-29 12:47:47 -07:00
Nikolay Korovaiko	9e22d15f14	Enable tensorexpr cpp tests in CI. try #2 (#35454 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35454 Differential Revision: D20665160 Pulled By: Krovatkin fbshipit-source-id: e04cbe92b2ee5a3288f3c4e5c83533bfea85bf85	2020-03-27 12:09:55 -07:00
Bram Wasti	a3e10d2a17	Expose enablement of TensorExpr fuser as env variable (#35341 ) Summary: This commit allows one to use an environment variable to enable the fuser in torch/csrc/jit/tensorexpr/ ``` PYTORCH_TENSOREXPR=1 python benchmark.py ``` This commit also changes the registration to happen by default, removing the requirement for the python exposed "_jit_register_tensorexpr_fuser" Pull Request resolved: https://github.com/pytorch/pytorch/pull/35341 Reviewed By: ZolotukhinM Differential Revision: D20676348 Pulled By: bwasti fbshipit-source-id: 4c997cdc310e7567c03905ebff72b3e8a4c2f464	2020-03-26 14:31:57 -07:00
Meghan Lele	6384c2d81b	[JIT] clang-format JIT code (#35115 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35115 This commit runs the newly added tools/clang_format.py on the JIT codebase and includes all of the formatting changes thus produced. Testing: Ran the script, CI. Test Plan: Imported from OSS Reviewed By: eellison Differential Revision: D20568523 Pulled By: SplitInfinity fbshipit-source-id: e09bdb982ccf090eecfb7c7b461b8d0681eef82b	2020-03-26 11:24:51 -07:00
Suraj Menon	aa01a95c6d	Revert D20630760: [pytorch][PR] Enable NNC tests vol. i. add test_tensorexpr.py tests [WIP] Test Plan: revert-hammer Differential Revision: D20630760 Original commit changeset: 7d2f27aca6b1 fbshipit-source-id: 28ac92b3390651a4a67061d6ebf208515b9b9463	2020-03-25 20:34:46 -07:00
Kimish Patel	dc2c4d02f9	Add a wrapper to wrap all optimization for mobile. (#35227 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35227 This wraps. 1. Conv BN folding (not mobile specific) 2. insert XNNPACK conv2d/Linear ops 3. Remove prepacking ops. Test Plan: Imported from OSS Differential Revision: D20603562 fbshipit-source-id: ff373af7112c070ec6198bac51845282e09ff1f8	2020-03-25 19:21:14 -07:00
Nikolay Korovaiko	f3a5081bd4	Enable NNC tests vol. i. add test_tensorexpr.py tests [WIP] (#34897 ) Summary: This PR add tensorexpr cpp tests to test_jit.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/34897 Differential Revision: D20630760 Pulled By: Krovatkin fbshipit-source-id: 7d2f27aca6b1e23e3ffed1c765d8f590688118e3	2020-03-25 17:23:48 -07:00
Elias Ellison	aab4beb87f	[JIT] Pass To Safely Remove Aten Inplace Ops (#33186 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33186 This helps create larger functional graphs. It has the potential to increase memory use, so in order to land this on by default we would probably also do a reuse of buffers pass. This is currently O(n * \| Removed Nodes \| ) because we have to rebuild the alias Db each time we make a change. This pass is critical to creating functional graphs, so this might be a compelling use case to build incremental updates to alias Db. Test Plan: Imported from OSS Differential Revision: D20603189 Pulled By: eellison fbshipit-source-id: 105db52bf38e02188ca6df6d36294466d3309a0a	2020-03-24 23:45:58 -07:00
Elias Ellison	5b2f8cef08	[JIT] Functional Graph Pass (#33020 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33020 This is a pass to create functional blocks. The other PRs in the stack help avoid some of the limitations that are are often found in graphs. It's possible that this would work well with a graph that is frozen. Follow up work items that will help this pass: - We don't currently have any capacity in alias analysis to tell whether a Value that came from the wildcard set "re-escapes" back into the wildcard set. - More comments on the semantics of the graph and correctness conditions - We could consider using dynamic dag if the perf of this is a limitation. - potential make Functional Graphs Functional Blocks instead, so that we do not repeatedly copy constants, also to make IR read easier. Test Plan: Imported from OSS Differential Revision: D20603188 Pulled By: eellison fbshipit-source-id: 6822a6e65f4cc2676f8f6445fe8aa1cb858ebeeb	2020-03-24 23:44:18 -07:00
davidriazati	44622bbda9	[jit] Add lazy script decorator (#34935 ) Summary: Stacked PRs * #34938 - [jit] Remove stray `script` * #34935 - [jit] Add lazy script decorator Some users maintain libraries of code that is largely trace-able but not script-able. However, some functions may need to be `torch.jit.script`ed if they contain control flow so the tracer will use the compiler version. This however impacts library start up time as in #33418, so this PR adds a workaround in the form of a `torch.jit._lazy_script_while_tracing` that will only initialize the compiler if the function is called while actually tracing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34935 Pulled By: driazati Differential Revision: D20569778 fbshipit-source-id: d87c88c02b1abc86b283729ab8db94285d7d4853	2020-03-24 13:43:18 -07:00
Supriya Rao	55019d357e	[quant][graphmode] Add observers for dynamic quant (#35121 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35121 For dynamic quantization we insert observers at the input to mimic the quatization of activations that happens in the operator Observer for weight is inserted similar to static quant Test Plan: python test/test_quantize_script.py Sample output for single layer FC .graph(%self : __torch__.___torch_mangle_4.M, %x.2 : Tensor): %_observer_1 : __torch__.torch.quantization.observer.MinMaxObserver = prim::GetAttr[name="_observer_1"](%self) %x.1 : Tensor = prim::CallMethod[name="forward"](%_observer_1, %x.2) %2 : __torch__.torch.nn.modules.linear.___torch_mangle_5.Linear = prim::GetAttr[name="fc"](%self) %3 : Tensor = prim::CallMethod[name="forward"](%2, %x.1) # test/test_quantize_script.py:19:23 return (%3) graph(%self : __torch__.torch.nn.modules.linear.___torch_mangle_5.Linear, %input.1 : Tensor): %2 : Function = prim::Constant[name="linear"]() %3 : Tensor = prim::GetAttr[name="weight"](%self) %_observer_0 : __torch__.torch.quantization.observer.MinMaxObserver = prim::GetAttr[name="_observer_0"](%self) %7 : Tensor = prim::CallMethod[name="forward"](%_observer_0, %3) %4 : Tensor = prim::GetAttr[name="bias"](%self) %5 : Tensor = prim::CallFunction(%2, %input.1, %7, %4) # /home/supriyar/miniconda3/envs/pytorch_py3/lib/python3.7/site-packages/torch/nn/modules/linear.py:87:15 return (%5) Imported from OSS Differential Revision: D20599144 fbshipit-source-id: 9a8fa0e8655b9908826b981dce8a11d86efce5df	2020-03-24 10:54:16 -07:00
Kimish Patel	e1c092fe3a	Changes to transition to generic API for ops with weight prepacking (#35010 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35010 semantics. This PR moves all the xnnpack specific interfces to a generic interface. Accordingly removes xnnpac specific reference from API and some variable names. What has not yet changed: TODO: USE_XNNPACK is still used. This can be removed where no XNNPACK specific things are done. e.g., RegisterOpContext.cpp and xnnpack_rewrite.cpp. Also the filename and structure also remains. Some of the generic class definition can be moved non-XNNPACK specific folder. Test Plan: python test/test_xnnpack_integration.py Imported from OSS Differential Revision: D20526416 fbshipit-source-id: 2e1725345c44bbb26bdc448097a7384eca121387	2020-03-22 08:31:53 -07:00
Wanchao Liang	c21fde6421	[jit] make jit/rpc share the same PythonFutureWrapper (#35039 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35039 This is the initial step towards merging ivalue future and rpc future Test Plan: Imported from OSS Differential Revision: D20537164 Pulled By: wanchaol fbshipit-source-id: d4f148c88e49ed6b0881ca4b4dd945ea24166183	2020-03-20 22:35:34 -07:00
Mikhail Zolotukhin	12f0052eee	Add TensorExpr Fuser tests (resubmit). (#35085 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35085 Test Plan: Imported from OSS Differential Revision: D20552334 Pulled By: ZolotukhinM fbshipit-source-id: 628fcf4719a879f18978ff8a0a64afbb045df645	2020-03-20 13:19:31 -07:00
Natalia Gimelshein	3c90a90730	Revert D20540599: Add TensorExpr Fuser tests. Test Plan: revert-hammer Differential Revision: D20540599 Original commit changeset: ced9b6657fe7 fbshipit-source-id: e8fa11f20207c35f39b3fbe6f45fc627715377c1	2020-03-19 18:37:32 -07:00
Mikhail Zolotukhin	7b59f41009	Add TensorExpr Fuser tests. (#35052 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35052 Differential Revision: D20540599 Test Plan: Imported from OSS Pulled By: ZolotukhinM fbshipit-source-id: ced9b6657fe72bca61833ab5d59bdaddcacd114b	2020-03-19 14:31:54 -07:00
Mikhail Zolotukhin	95833a49e6	[TensorExpr] Pull changes from bertmaher/pytorch_fusion. (#34842 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34842 This PR (hopefully the last one of such kind) is merging changes from a side branch where tensor expessions based fuser work has been done so far. This PR is is a squashed version of changes in the side branch, which is available here: https://github.com/bertmaher/pytorch Differential Revision: D20478208 Test Plan: Imported from OSS Pulled By: ZolotukhinM fbshipit-source-id: 21556e009f1fd88099944732edba72ac40e9b9c0	2020-03-17 11:02:48 -07:00
Mikhail Zolotukhin	35e7efeb9a	[TensorExpr] Add CUDA codegen. (#34227 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34227 This PR adds a CUDA support to tensor expressions. Differential Revision: D20251836 Test Plan: Imported from OSS Pulled By: ZolotukhinM fbshipit-source-id: ab36a55834cceff30c8371fef6cca1054a32f017	2020-03-16 11:49:29 -07:00
Mikhail Zolotukhin	42b2c8c65d	[TensorExpr] Add a fuser pass based on tensor expressions. (#34226 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34226 LLVM and Cuda backends are added in subsequent PRs, so at this point the fuser is pretty useless, but it still can be tested and its logic is not going to change with addition of the codegens. Differential Revision: D20251838 Test Plan: Imported from OSS Pulled By: ZolotukhinM fbshipit-source-id: 82b0d221fa89904ed526689d02a6c7676a8ce8de	2020-03-16 11:49:24 -07:00
peter	24c9e61e79	Enable JIT tests on Windows (#27029 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27029 Reviewed By: eellison Differential Revision: D20458664 Pulled By: jamesr66a fbshipit-source-id: 22be918543703869f471e89b3478423198351bf3	2020-03-16 11:26:21 -07:00
Kimish Patel	4da5569300	Pass to remove prepacking ops. (#34319 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34319 Removes prepacking ops and install them as attributes of the top level module. Needs to run freezing as the first pass. Test Plan: python test/test_xnnpack_integration.py Imported from OSS Differential Revision: D20290726 fbshipit-source-id: 633ceaa867ff7d5c8e69bd814c0362018394cb3a	2020-03-14 12:53:31 -07:00
Kimish Patel	7dd5da2026	JIT pass to insert XNNPACK ops (#34048 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34048 Rewrites the graph to insert xnnpack prepack and packed run ops for conv2d and linear. Test Plan: python test/test_xnnpack_integration.py Imported from OSS Differential Revision: D20185658 fbshipit-source-id: c4c073c912ad33e822e7beb4ed86c9f895129d55	2020-03-14 12:53:27 -07:00
Zachary DeVito	52005b551c	invokeOperatorFromPython: support overloaded operator calling (#34671 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34671 Like the python arg parser, this tries to convert to the schema in order. It introduces schema_match_exception which gets thrown when the schema doesn't match, allowing the overload handler to try the next option. Behavior will not 100% match the schema argument parser but should work for simple cases using custom binding. Test Plan: Imported from OSS Differential Revision: D20432206 Pulled By: zdevito fbshipit-source-id: 280839a2205ea3497db3a9b5741fccc1e2bff9a8	2020-03-13 18:46:03 -07:00
Michael Suo	af3a7e2b50	[jit] small cleanups after script:: removal (#34677 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34677 1. Remove remaining uses of `script::` namespace from the codebase, 2. Add one more typedef for `script::ExtraFilesMap` which is part of the public interface. Pull Request resolved: #34580 Test Plan: Imported from OSS Reviewed By: zdevito Differential Revision: D20431739 Pulled By: suo fbshipit-source-id: a29d369c755b6506c53447ca1f286b6339222c9a	2020-03-13 17:56:16 -07:00
Jerry Zhang	90ca7a1feb	[quant][graphmode] Add Finalize function that inlines graph and produce quantized ops (#33927 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33927 Test Plan: test will be added in later PRs Imported from OSS Differential Revision: D20354879 fbshipit-source-id: 03976f4b86c46dbdc4e45764a1e72f1a3855a404	2020-03-12 14:52:58 -07:00
Michael Suo	c235be42dd	[jit] kill script namespace (#34515 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34515 Once upon a time we thought this was necessary. In reality it is not, so removing it. For backcompat, our public interface (defined in `api/`) still has typedefs to the old `script::` names. There was only one collision: `Pass` as a `Stmt` and `Pass` as a graph transform. I renamed one of them. Test Plan: Imported from OSS Differential Revision: D20353503 Pulled By: suo fbshipit-source-id: 48bb911ce75120a8c9e0c6fb65262ef775dfba93	2020-03-11 23:32:48 -07:00
Jerry Zhang	2e7eef41ac	[quant][graphmode] Swap quantized functional linear with aten::linear (#33853 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33853 Quant fusion relies on inline, but inline will break the CallFunction("linaer", ...) into a if block it will be hard to recognize this block and swap it with quantized::linear, in order to preserve the op, we will swap all quantized functional linear into aten::linear. They might produce different backward graph, but this is called in the step before we get quantized model, so it shouldn't affect anything. We'll integrate this with convert_script later in the new "finalize_quant" API Test Plan: python test/test_jit.py Imported from OSS Differential Revision: D20343873 fbshipit-source-id: 423e03bf893b79267d2dc97bc997ee1bfe54ec0f	2020-03-09 15:45:20 -07:00
Jie	2b79bab029	[CUDA_FUSER] Fork CUDA fuser (#33527 ) Summary: Separating CUDA fuser from CPU fuser. 1. New node in IR - prim::CudaFusionGroup: This enables the cuda fuser to co-exist along side the old fuser. Allows us to incrementally build and expand cuda fuser. 2. copied FuseGraph optimization passes to CudaFuserGraph: We will re-factor & reuse Chunk/Concat in the old fuser logic, which is handled in the optimization pass at this moment. Unfortunately many code in the pass is tightly binded with the legacy fuser, which makes code sharing difficult. The CudaFusionGraph will support only a subset of operations comparing to legacy fuser (CUDA only). It is registered as a custom pass post fusion via ```torch._C._jit_register_cuda_fuser()``` To have it in effect, you should also turn off fusion on GPU via ```torch._C._jit_override_can_fuse_on_gpu(False)``` 3. We don't have codegen in this PR yet (WIP). Currently we just fall back to the old fuser. Pull Request resolved: https://github.com/pytorch/pytorch/pull/33527 Differential Revision: D20171598 Pulled By: ZolotukhinM fbshipit-source-id: 9a3c0f06f46da7eaa80ae7551c04869f5b03ef71	2020-03-04 20:25:08 -08:00
Zino Benaissa	cab8772c6c	Freezing Torchscript modules (#32178 ) Summary: This patch enables folding GetAttr nodes with their corresponding values. _jit_pass_freeze_module API returns a new TorchScipt module where all function calls and get attributes are inlined. Usage: frozen_model = torch._C._freeze_module(scrited_model._c) frozen_model.forward(...) This API currently optimizes the forward method. We will follow up to to preserve and optimize methods and attributes that are annotated as torch.jit.interface. Several future improvements to JIT optimizations are required to maximize clean up/de-sugar the graph and eliminate redundancies. Ideally, we want to produce a graph that can easily be lowered to GLOW and other low-level backends. __ Pull Request resolved: https://github.com/pytorch/pytorch/pull/32178 Differential Revision: D19419640 Pulled By: bzinodev fbshipit-source-id: 52baffaba9bca2cd60a8e747baa68d57711ad42b	2020-03-02 11:38:36 -08:00
Michael Suo	dbe850af5b	[jit] do the code reorg (#33851 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33851 Rationale and context described in #33828. Script to reproduce the move: https://gist.github.com/suo/16cbefaaeb67ca5a7c6caffd49b7f6e9 ghstack-source-id: 99079645 Test Plan: Make sure CI passes Reviewed By: jamesr66a Differential Revision: D20133869 fbshipit-source-id: 390e9241a9c85366d9005c492ac31f10aa96488e	2020-02-27 13:02:51 -08:00

36 Commits