Summary:
Some of the "no-ops" are not actually no-ops because they can change the dtype
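A hedged illustration of the pitfall (the exact ops touched by this PR are in the diff, not listed here): a call that looks like an identity can still change the dtype, so a pass that eliminates it as a "no-op" would change the program's output type.
```python
import torch

x = torch.randn(4, dtype=torch.float16)
y = x.to(torch.float32)      # values are preserved, but...
assert y.dtype != x.dtype    # ...the dtype changed, so this is not a no-op
```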
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67688
Reviewed By: davidberard98
Differential Revision: D32104601
Pulled By: eellison
fbshipit-source-id: ccb99179a4b30fd20b5a9228374584f2cdc8ec21
Summary:
Adds mixed precision autocasting support between fp32/fp16 to TorchScript/JIT. A more in-depth description can be found at [torch/csrc/jit/JIT-AUTOCAST.md](https://github.com/pytorch/pytorch/pull/63939/files#diff-1f1772aaa508841c5bb58b74ab98f49a1e577612cd9ea5c386c8714a75db830b)
This PR implements an autocast optimization pass (torch/csrc/jit/passes/autocast.cpp) that inserts casting ops per the AMP rules, mimicking the behavior of eager autocast. The pass also takes the `torch.cuda.amp.autocast` context into consideration and only inserts casting ops within an enabled context manager, giving feature parity with eager AMP autocast.
We currently provide JIT AMP autocast as a prototype feature, so it is off by default and can be turned on via `torch._C._jit_set_autocast_mode(True)`.
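A minimal usage sketch of this prototype flag, following the pattern described in JIT-AUTOCAST.md (assumes a CUDA build; the API may change since this is a prototype):
```python
import torch
from torch.cuda.amp import autocast

torch._C._jit_set_autocast_mode(True)      # JIT autocast is off by default

@torch.jit.script
def fused_mm(a, b):
    with autocast():
        return torch.mm(a, b)              # cast ops are inserted per AMP rules

a = torch.rand(8, 8, device="cuda")
b = torch.rand(8, 8, device="cuda")
print(fused_mm(a, b).dtype)                # expected: torch.float16 under autocast
```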
The JIT support for autocast is subject to different constraints compared to the eager mode implementation (mostly related to the fact that TorchScript is statically typed); restrictions on the user-facing Python code are described in torch/csrc/jit/JIT-AUTOCAST.md.
This is a prototype; there are also implementation limitations that are necessary to keep this PR small and get something functioning quickly on upstream, so we can iterate on designs.
A few limitations/challenges that are not properly resolved in this PR:
1. Autocast inserts cast operations, which affect the scalar type of output tensors feeding downstream operations. We are not currently propagating the updated scalar types, which could give wrong results for operations subject to type promotion rules.
2. The backward pass for autodiff in JIT misses the cast of dgrad to the input scalar type that autograd performs in eager mode. This forces us to explicitly mark the casting operation for certain operations (e.g. binary ops); otherwise, we might feed a dgrad with a mismatched scalar type to the input. This could potentially break gradient functions consuming dgrad (e.g. gemm backward, which assumes grad_output has the same scalar type as the input).
3. The `torch.autocast` API has an optional `dtype` argument, which is not currently supported in JIT autocast; we require a static value.
Credit goes mostly to:
tlemo
kevinstephano
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63939
Reviewed By: navahgar
Differential Revision: D31093381
Pulled By: eellison
fbshipit-source-id: da6e26c668c38b01e296f304507048d6c1794314
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66140
* Add a new argument to the export API that lets users specify `nn.Module` classes they wish to be exported as local functions in the ONNX model (see the usage sketch after this list).
* Refactor `torch/csrc/jit/serialization/export.cpp`, and remove redundant `EncoderBase` class.
* ~~Contains changes from #63268~~
* Depends on #63716 to update onnx submodule.
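Hedged usage sketch. The argument name `export_modules_as_functions` and the opset requirement are my assumptions about how the new option is exposed; check the `torch.onnx.export` documentation for the exact spelling in your version.
```python
import io
import torch

class Block(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = torch.nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.lin(x))

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.b1, self.b2 = Block(), Block()

    def forward(self, x):
        return self.b2(self.b1(x))

torch.onnx.export(
    Model(), torch.randn(1, 4), io.BytesIO(),
    opset_version=15,                      # assumption: local functions need a recent opset
    export_modules_as_functions={Block},   # assumed argument name for the new option
)
```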
Test Plan: Imported from OSS
Reviewed By: jansel
Differential Revision: D31424098
fbshipit-source-id: c949d0b01c206c30b4182c2dd1a5b90e32b7a0d3
Co-authored-by: BowenBao <bowbao@microsoft.com>
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63198
Linear layers using the same input tensor can be concatenated together
as long as the weights and biases are compatible.
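A hedged sketch of why this fusion is valid (the algebra, not the pass implementation): two Linear layers applied to the same input are equivalent to one Linear whose weight and bias are the concatenation of the originals.
```python
import torch

x = torch.randn(2, 8)
l1, l2 = torch.nn.Linear(8, 3), torch.nn.Linear(8, 5)

w = torch.cat([l1.weight, l2.weight], dim=0)   # (3 + 5) x 8
b = torch.cat([l1.bias, l2.bias], dim=0)       # (3 + 5)
fused = torch.nn.functional.linear(x, w, b)

expected = torch.cat([l1(x), l2(x)], dim=1)
assert torch.allclose(fused, expected, atol=1e-6)
```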
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D31240642
fbshipit-source-id: 1e78daa6b89822412ba2513d326ee0e072ceff1e
Summary:
Description:
- Only `stdout` and `stderr` are added as possible options from the Python
API for now; passing a file path can be added later.
- Put the class `JitLoggingConfig` in the cpp file as none of its methods were being used outside of this file.
Python API:
`torch._C._jit_set_logging_stream('stdout|stderr')`
C++ API:
`::torch::jit::set_jit_logging_output_stream(ostream);`
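A minimal sketch using the Python binding named above (only "stdout"/"stderr" are accepted for now, per the description):
```python
import torch

# route JIT logging output to stdout instead of the default stream
torch._C._jit_set_logging_stream("stdout")
```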
Testing:
- Tested python API locally.
- Unit test for the C++ API is written
Fixes https://github.com/pytorch/pytorch/issues/54182
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65768
Reviewed By: mrshenli
Differential Revision: D31291739
Pulled By: ZolotukhinM
fbshipit-source-id: eee72edc20488efad78a01c5b0ed8a132886a08d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610
- Replace HIP_PLATFORM_HCC with USE_ROCM.
- Don't rely on CUDA_VERSION or HIP_VERSION; use USE_ROCM and ROCM_VERSION instead.
- In the next PR:
  - Remove the mapping from CUDA_VERSION to HIP_VERSION and from CUDA to HIP in hipify.
  - HIP_PLATFORM_HCC is deprecated, so add HIP_PLATFORM_AMD to support HIP host code compilation on gcc.
cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd
Reviewed By: jbschlosser
Differential Revision: D30909053
Pulled By: ezyang
fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65097
Previously, BatchMM would skip any block containing any mutable
operators. Now it will avoid batching any operation whose inputs or
outputs are ever mutated. Specifically: consider a tree of ADD, T,
and MM nodes rooted at an ADD node. If any input or output to any
node in the tree is ever mutated, then the entire tree will be ignored
by BatchMM.
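A hedged illustration of the new behavior using scripted functions (not the pass code itself): the first mm/add tree has no mutation and is eligible for batching, while the second is skipped because one of its inputs is mutated in-place.
```python
import torch

@torch.jit.script
def batchable(x, w1, w2):
    return torch.mm(x, w1) + torch.mm(x, w2)    # no mutation: BatchMM may batch these

@torch.jit.script
def skipped(x, w1, w2):
    w1.add_(1.0)                                # w1 is mutated, so the whole
    return torch.mm(x, w1) + torch.mm(x, w2)    # ADD/MM tree is left alone
```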
Test Plan: python test/test_jit.py TestBatchMM
Reviewed By: eellison
Differential Revision: D30973515
Pulled By: davidberard98
fbshipit-source-id: 9d836faa1ef0c9e3fefe0ffc0bd265f275471f48
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63776
I reverted this out of an abundance of caution because some test
failures occurred, but they were all due to precision issues fixed lower in
this stack. Let's try again.
I've rolled the elimination of the allow-parallelism-in-fusions toggle into
this diff since they're pretty tightly coupled.
ghstack-source-id: 136529847
Test Plan: CI
Reviewed By: huiguoo
Differential Revision: D30484555
fbshipit-source-id: 38fd33520f710585d1130c365a8c60c9ce794a59
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62763
This PR fixes an issue where the graph inputs might be updated when the model is exported in inference mode.
When a model is exported in inference mode, some optimizations are applied. One side effect of these optimizations is that the graph inputs might be adjusted. Such optimizations include:
1. Conv and BatchNorm op fusion.
2. Constant folding.
If the user sets export_params=False or keep_initializers_as_inputs=True, it's highly likely that the user wants to provide the corresponding parameters or initializers as the inputs of the graph.
In that situation, regardless of whether the model is exported in inference or training mode, the exporter needs to prevent the above optimizations from adjusting the graph inputs, so that the graph inputs match the inputs users provided.
The changes in this PR add a common check that decides whether the above optimizations should be applied: from the values of the export_params and keep_initializers_as_inputs arguments, we infer whether the graph inputs are allowed to be adjusted.
If not, these optimizations are skipped, even if the other requirements are met.
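A hedged usage sketch of the scenario described above: with keep_initializers_as_inputs=True (or export_params=False), the exporter now skips input-changing optimizations such as Conv+BatchNorm fusion and constant folding, even in inference mode.
```python
import io
import torch

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3),
    torch.nn.BatchNorm2d(8),
).eval()  # inference-mode export

torch.onnx.export(
    model, torch.randn(1, 3, 16, 16), io.BytesIO(),
    keep_initializers_as_inputs=True,  # graph inputs should keep the initializers
    do_constant_folding=True,
)
```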
Besides these code changes, the documentation of some parameters below has been updated so that users can better understand how to leverage these parameters for different purposes:
1. export_params
2. training
3. do_constant_folding
4. keep_initializers_as_inputs
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D30375183
Pulled By: msaroufim
fbshipit-source-id: 4db8b9695649eb32a3a0fefa950ee2e5651bdba0
Co-authored-by: fatcat-z <jiz@microsoft.com>
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59492
Adding code to find common expressions from the two subblocks of an if
operation and hoist them before the if block.
This also allows Dead Code Elimination to
then eliminate some if blocks.
Also eliminated some dead code in the codebase.
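A hedged TorchScript example of a candidate for this pass: `x * 2` is computed in both branches, so it can be hoisted above the if (after which the if itself may become eligible for dead code elimination in some cases).
```python
import torch

@torch.jit.script
def f(x: torch.Tensor, cond: bool):
    if cond:
        y = x * 2      # common to both branches...
        return y + 1
    else:
        y = x * 2      # ...so it can be hoisted before the if
        return y - 1
```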
Test Plan:
python test_jit.py TestIfHoisting
Imported from OSS
Reviewed By: ngimel
Differential Revision: D29399533
fbshipit-source-id: 9336b9dc48c02c38862f98f98cd72fc1767a1802
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59814
Using OpInfos to test shape analysis. By default, we just check that we don't give incorrect answers; if `assert_jit_shape_analysis` is true, we test that we correctly propagate the full shape. And it found a couple of bugs 😃
Test Plan: Imported from OSS
Reviewed By: Krovatkin
Differential Revision: D30200058
Pulled By: eellison
fbshipit-source-id: 6226be87f5390277cfa5a1fffaa1b072d4bc8803
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62200
This commit brings back the `RemoveInplaceOps` pass removed in D29523283 (dec5aa2260) that apparently had a bunch of internal users.
Test Plan: danthe3rd
Reviewed By: danthe3rd
Differential Revision: D29833316
fbshipit-source-id: 6cf13d463ab0a5e50ba3eb3243f79a9c51623809
Summary:
* Minor: spelling, grammar.
* Add calls to `GRAPH_DUMP()` where they were missing.
* Add or expand a few comments.
* Move a few comments to seemingly more appropriate spots.
* In canonicalize_graph_fuser_ops.cpp inline `runnableInputs()` since it
was only called in one place and had a misleading comment and
confusing name.
* In `PeepholeOptimizeImpl::optimizeBlock()`, set `changed = true;` when
removing `aten::is_complex`. Pretty sure its absence was a bug.
* Delete unused `_jit_pass_remove_inplace_ops` and its
implementation `RemoveInplaceOps()`.
* In `preprocessCaffe2Ops()`, remove redundant check for nested optional
types. It was already checked in `checkONNXCompatibility()`.
* In `EncoderBase::AddAttribute`, log the unexpected attribute kind.
I don't remember the repro case now but I did hit this error at some
point and this additional logging made it easier to understand.
* In `fuseConvBatchNorm()` in eval_peephole.cpp, consistently use
camelCase instead of snake_case for local variables.
* Add curly braces around the bodies of if statements and loops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60390
Reviewed By: Krovatkin
Differential Revision: D29523283
Pulled By: SplitInfinity
fbshipit-source-id: 4e16c5648616f53da07d68dab7fdf252e06a0752
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57334
Here's a possibly controversial PR. These counters got in the way of
generalizing the fuser tests to handle arbitrary devices, and I guess I'm just
generally skeptical that they provide much value. While it's true that they let us
observe whether fusion groups were created, we already have assertions based on
the shape of the graph, and I'm not sure that I trust those any less than these
counters.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D29471484
Pulled By: bertmaher
fbshipit-source-id: f6d76f6e72dbfb581acff1d834b0c74500941b57
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59735
1. Fixes the ABA storage identity problem during serialization for `torch.package` by keeping references to serialized storages alive for the lifetime of `PackageExporter`, preventing reuse of a memory address. Achieved by extending the logic used to solve the same issue on mobile. (A user-level sketch of this path follows the list below.)
2. Adds determinism to the naming scheme of serialized storages in export code paths that use `tensor_cdata_naming_scheme` (introduces a second mapping in `StorageContext`, which now maps `storage cdata ptr` -> `unique id` and `unique id` -> `c10::Storage`).
3. Additionally uses the presence of a storage in the `StorageContext` instance as a marker for whether a storage has been serialized, removing the need to scan the `PythonStreamWriter` for the storage's serialization file.
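A hedged, user-level sketch of the serialization path this touches (the ABA fix itself is internal; the output path below is just an example): saving multiple pickles that reference tensor storages through one exporter now keeps serialized storages alive, so a reused memory address cannot be mistaken for an already-written storage.
```python
import torch
from torch.package import PackageExporter

with PackageExporter("example_package.pt") as exporter:
    exporter.save_pickle("tensors", "a.pkl", torch.randn(4))
    exporter.save_pickle("tensors", "b.pkl", torch.randn(4))
```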
Test Plan: Imported from OSS
Reviewed By: suo
Differential Revision: D29075276
Pulled By: Lilyjjo
fbshipit-source-id: 15a5c30b1de99c5bd7079388f2db9b6ece2eca12
Summary:
Description:
- Before this, the logging level could only be changed via the env
variable "PYTORCH_JIT_LOG_LEVEL"
- The level can now be changed from Python
- Have not added stream configuration for now
- Configuration is stored in a singleton class managing the options
Issue Link: https://github.com/pytorch/pytorch/issues/54188
Gotchas:
- Created separate functions
`::torch::jit::get_jit_logging_levels/set_jit_logging_levels` instead of
using the singleton class's methods directly
- This is because, when running test cases, two different instances
of the singleton are created: one for the test suite and one for the actual code
(`jit_log.cpp`)
- When using the singleton's methods directly, `is_enabled` calls the singleton
in `jit_log.cpp` while we are setting the config on another
singleton
- See: https://stackoverflow.com/questions/55467246/my-singleton-can-be-called-multiple-times
API:
- To set the level: `torch._C._jit_set_logging_option("level")`
- To get the level: `torch._C._jit_get_logging_option()`
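A minimal sketch using the bindings named above (the `>file_name` syntax is the same one accepted by the PYTORCH_JIT_LOG_LEVEL environment variable):
```python
import torch

torch._C._jit_set_logging_option(">dead_code_elimination")  # more verbose logs for that pass
print(torch._C._jit_get_logging_option())
```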
Testing:
- Unit tests were added for C++
- A very simple unit test was added for Python to check that the API is
being called correctly
- The API was also checked by running a trace in a sample Python file:
- Set the env variable to "" and used `_jit_set_logging_option` in Python to set the level to `>dead_code_elimination`
- The error output contained logs of the form [DUMP ..], [UPDATE ...], etc.
Fixes https://github.com/pytorch/pytorch/issues/54188
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58821
Reviewed By: soulitzer
Differential Revision: D29116712
Pulled By: ZolotukhinM
fbshipit-source-id: 8f2861ee2bd567fb63b405953d035ca657a3200f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56966
This PR adds a toggle to shape analysis that prevents inlining complete tensor shapes as constants into the shape compute graph, which is a good stress test for the partial evaluation pipeline.
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D28444664
Pulled By: eellison
fbshipit-source-id: a62e424515a8837a4b596546efa93af5e8e61f10
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58300
Current state: graph rewrites that fuse or add nodes produce new nodes
without the debug information that was available on the original nodes,
so we lose this information during the rewrite.
This PR changes the graph rewriting API to let the user specify how the values
in the replacement pattern map to values in the pattern being matched.
The graph rewriting then copies the source range and inlined callstack
from the matched nodes onto the nodes being inserted.
(Note: this ignores all push blocking failures!)
Test Plan:
python test/test_jit.py
TestJit.test_pattern_based_rewrite_with_source_range_preserved
Imported from OSS
Reviewed By: malfet
Differential Revision: D28512465
fbshipit-source-id: 863173c29de726be85b3acbd3ddf3257eea36d13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55926
This is necessary for code like conv2d, where we wish to share the generic convolution shape function logic with conv2d but, for conv2d, always infer that the output has dimension 4. I'm also hoping the refinement algorithm here can be refactored out and used to support refining tensor types from user annotations. I have a lengthy comment explaining how this works, and the logic outside of the data structures is pretty small and contained. Additionally, you might check out https://fb.quip.com/X7EVAdQ99Zzm for a very similar description of how to refine values based on comparison operators.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D27750997
Pulled By: eellison
fbshipit-source-id: d962415af519ac37ebc9de88f2e1ea60a1374f7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55925
This sets up the initial handling of symbolic shapes. As noted in the test, it doesn't work perfectly yet because it needs a couple of other optimization passes. The basic description is pretty simple: we resolve tensor dimension indices to the same Value*, and before extracting the output tensor shape we substitute in symbolic shapes. We don't substitute during optimization because symbolic shapes are represented as negative numbers, and we don't want them inadvertently used in constant propagation or elsewhere.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D27750996
Pulled By: eellison
fbshipit-source-id: 6984e7276b578f96b00fc2025cef0e13f594b6e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54809
I'm going to post on dev-discuss soon with a more thorough explanation of the design and advantages of this shape analysis, so I'm leaving out that for now.
There is still a ton left to do; I'm posting this initial version so we can get something on master that multiple people can work on. A list of the many remaining steps:
- [ ] Add symbolic shapes support
- [ ] Bind shape functions for operators in C++
- [ ] Make classes of operators share the same shape function (e.g. pointwise, broadcast two inputs)
- [ ] Refactor APIs
- [ ] Only iteratively optimize shape function while a change has been made
- [ ] Expand coverage to common ops
- [ ] Add shape analysis pass on Graph that handles Ifs and Loops
- [ ] Allow concurrent reads to the operator map
- [ ] Successive applications of same inputs to same shape function (e.g. series of pointwise ops)
For this review, I am mostly looking for comments related to the implementation of symbolic_shape_analysis.cpp, with the caveats listed above. I am not really looking for comments related to the API/registration/graph-level analysis, as those are all planned to change. I am fine landing this as is or waiting until the necessary components of the TODOs above are finished.
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D27750998
Pulled By: eellison
fbshipit-source-id: 4338b99e8651df076291c6b781c0e36a1bcbec03
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58382
Calling markCompleted on a Future now first acquires the Future's mutex (as usual) but then sometimes tries to acquire the GIL during the DataPtr extraction while still holding the Future's mutex. (This happens when the value passed to markCompleted is a Python object). This can cause a deadlock if someone else calls any of the other methods of Future while holding the GIL.
There are two solutions to this: avoid holding the Future's mutex when extracting DataPtrs, and avoid holding the GIL while invoking the Future's method. In this PR I'm going for the latter, because it's a very simple immediate fix, but I believe this is brittle and that we should probably also consider the former fix.
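A hedged analogy of the lock-order inversion described above, using two plain Python locks to stand in for the Future's mutex and the GIL (an illustration of the hazard, not PyTorch code). If these two functions run concurrently, acquiring the locks in opposite orders is exactly the pattern that can deadlock.
```python
import threading

future_mutex = threading.Lock()   # stands in for the Future's internal mutex
gil = threading.Lock()            # stands in for the Python GIL

def mark_completed_like():
    with future_mutex:            # takes the mutex first...
        with gil:                 # ...then needs the GIL to extract DataPtrs from a py::object
            pass

def other_future_method_like():
    with gil:                     # a GIL-holding caller...
        with future_mutex:        # ...invokes a Future method that takes the mutex
            pass
```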
ghstack-source-id: 129105358
Test Plan: The repro in https://github.com/pytorch/pytorch/issues/58239 now doesn't deadlock.
Reviewed By: mrshenli
Differential Revision: D28472816
fbshipit-source-id: 1bc9bca426dd004f9eb2568db1ffd38f014450e2