pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Bert Maher	a709ab34a8	[nnc] Re-enable CPU fusion" (#63665 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63665 This reverts commit `125e2d02e5`. Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D30471646 Pulled By: bertmaher fbshipit-source-id: 4189869566f03b5f9ada78d78830f6a34946eed6	2021-08-23 12:42:42 -07:00
Alban Desmaison	125e2d02e5	Revert D30417370: [nnc] Enable CPU fusion Test Plan: revert-hammer Differential Revision: D30417370 (`b9fc656cf2`) Original commit changeset: 84ce7a578a36 fbshipit-source-id: cd23774cdc3273fd72f8a05f1900eaf36f373e6b	2021-08-20 12:30:21 -07:00
Bert Maher	b9fc656cf2	[nnc] Enable CPU fusion (#63545 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63545 Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D30417370 Pulled By: bertmaher fbshipit-source-id: 84ce7a578a3678d5562bab99d1dc00330c4f72d1	2021-08-20 11:18:21 -07:00
Nikita Shulga	30214aef2d	[BE] irangefy (#62928 ) Summary: Replace for loop with for `irange` loop. Also fix some unused variable warnings in range loop cases Pull Request resolved: https://github.com/pytorch/pytorch/pull/62928 Reviewed By: driazati Differential Revision: D30171904 Pulled By: malfet fbshipit-source-id: 1b437a0f7e3515f4a2e324f3450e93312f1933ae	2021-08-07 13:34:13 -07:00
Richard Barnes	f883ed9095	irange-ify 8b (#62195 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62195 Test Plan: Sandcastle Reviewed By: malfet Differential Revision: D29887946 fbshipit-source-id: e3bd44721cf06a34ced47994810212be8460a2bb	2021-07-26 15:38:54 -07:00
Mike Guo	6ecc1a4c4f	Make pytorch clang-tidy clean (#60649 ) Summary: This PR suppresses clang-tidy warnings in the codebase (for now) so that we can re-enable clang-tidy checks on master. I ran this script to add the `NOLINTNEXTLINE` comments (on a devserver): ```bash python3 setup.py develop # Uses same script that's run on CI and adds the -j (parallel), -s (add comments), -k (continue if diagnostic errors are found) options python3 tools/clang_tidy.py \ -j \ -s \ -k \ -v \ --paths torch/csrc/ \ -g"-torch/csrc/jit/passes/onnx/helper.cpp" \ -g"-torch/csrc/jit/passes/onnx/shape_type_inference.cpp" \ -g"-torch/csrc/jit/serialization/onnx.cpp" \ -g"-torch/csrc/jit/serialization/export.cpp" \ -g"-torch/csrc/jit/serialization/import.cpp" \ -g"-torch/csrc/jit/serialization/import_legacy.cpp" \ -g"-torch/csrc/onnx/init.cpp" \ -g"-torch/csrc/cuda/nccl." \ -g"-torch/csrc/cuda/python_nccl.cpp" \ -g"-torch/csrc/autograd/FunctionsManual.cpp" \ -g"-torch/csrc/generic/.cpp" \ -g"-torch/csrc/jit/codegen/cuda/runtime/*" \ -g"-torch/csrc/deploy/interpreter/interpreter.cpp" \ -g"-torch/csrc/deploy/interpreter/interpreter.h" \ -g"-torch/csrc/deploy/interpreter/interpreter_impl.h" \ -g"-torch/csrc/deploy/interpreter/test_main.cpp" ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/60649 Test Plan: Verified changes by re-running the script (without the `-s` option) and seeing no warnings/errors. Reviewed By: walterddr, janeyx99 Differential Revision: D29504258 Pulled By: 1ntEgr8 fbshipit-source-id: 78310b30ee8213b73ddb4771ad874665323e7a4e	2021-07-01 12:21:07 -07:00
Richard Barnes	3979cb0656	irange for size_t (#55320 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55320 Test Plan: Sandcastle Reviewed By: ngimel Differential Revision: D27572577 fbshipit-source-id: 97710fd2bb1303006b05828a0d1343b0b59ccb03	2021-06-03 01:04:13 -07:00
Nikita Shulga	3a66a1cb99	[clang-tidy] Exclude cppcoreguidelines-avoid-magic-numbers (#57841 ) Summary: Add cppcoreguidelines-avoid-magic-numbers exclusion to clang-tidy Remove existing nolint warnings using following script: ``` for file in `git ls-files \| grep -v \.py`; do gsed '/^ *\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers)/d' -i $file; done ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/57841 Reviewed By: samestep Differential Revision: D28295045 Pulled By: malfet fbshipit-source-id: 7c6e8d1213c9593f169ed3df6a916498f1a97163	2021-05-07 20:02:33 -07:00
Nikita Shulga	4cb534f92e	Make PyTorch code-base clang-tidy compliant (#56892 ) Summary: This is an automatic change generated by the following script: ``` #!/usr/bin/env python3 from subprocess import check_output, check_call import os def get_compiled_files_list(): import json with open("build/compile_commands.json") as f: data = json.load(f) files = [os.path.relpath(node['file']) for node in data] for idx, fname in enumerate(files): if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'): files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')] return files def run_clang_tidy(fname): check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"]) changes = check_output(["git", "ls-files", "-m"]) if len(changes) == 0: return check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"]) def main(): git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n") compiled_files = get_compiled_files_list() for idx, fname in enumerate(git_files): if fname not in compiled_files: continue if fname.startswith("caffe2/contrib/aten/"): continue print(f"[{idx}/{len(git_files)}] Processing {fname}") run_clang_tidy(fname) if __name__ == "__main__": main() ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892 Reviewed By: H-Huang Differential Revision: D27991944 Pulled By: malfet fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179	2021-04-28 14:10:25 -07:00
Mike Ruberry	c0ac0fef4e	Revert D27448156: irange for size_t Test Plan: revert-hammer Differential Revision: D27448156 (`041b4431b2`) Original commit changeset: 585da57d4de9 fbshipit-source-id: 8e047c29f391c0166e0a1a87c3fb2a0854377365	2021-04-03 19:14:00 -07:00
Richard Barnes	041b4431b2	irange for size_t (#55163 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55163 Test Plan: Sandcastle Reviewed By: ngimel Differential Revision: D27448156 fbshipit-source-id: 585da57d4de91c692b6360d65f7b8a66deb0f8c1	2021-04-02 23:22:29 -07:00
Edward Yang	c782949e17	Make the fuser raise NotImplementedError when unknown device is hit (#54709 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54709 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: eellison Differential Revision: D27338815 Pulled By: ezyang fbshipit-source-id: 5cbaf3c19b9b85cc3f171f3b405d0cd98f832e65	2021-03-27 11:55:47 -07:00
chengjun	4a8ef4525e	Add new backend type for Intel heterogeneous computation platform. (#49786 ) Summary: Add a new device type 'XPU' ('xpu' for lower case) to PyTorch. Changes are needed for code related to device model and kernel dispatch, e.g. DeviceType, Backend and DispatchKey etc. https://github.com/pytorch/pytorch/issues/48246 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49786 Reviewed By: mrshenli Differential Revision: D25893962 Pulled By: ezyang fbshipit-source-id: 7ff0a316ee34cf0ed6fc7ead08ecdeb7df4b0052	2021-01-20 08:15:18 -08:00
Scott Wolchok	4a0d17ba2d	[PyTorch][codemod] Replace immediately-dereferenced expect calls w/expectRef (#50228 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50228 `fastmod -m 'expect(<((at\|c10)::)?\w+Type>\s*)->' 'expectRef${1}.'` Presuming it builds, this is a safe change: the result of `expect()` wasn't being saved anywhere, so we didn't need it, so we can take a reference instead of a new `shared_ptr`. ghstack-source-id: 119782961 Test Plan: CI Reviewed By: SplitInfinity Differential Revision: D25837374 fbshipit-source-id: 86757b70b1520e3dbaa141001e7976400cdd3b08	2021-01-13 16:13:55 -08:00
Meghan Lele	8746e1a1cc	[JIT] Fix clang-tidy warnings in jit/passes (#47984 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47984 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D25258638 Pulled By: SplitInfinity fbshipit-source-id: 0ed5ef6984ba988a2c67407efcc77355ca25bbee	2020-12-02 12:35:34 -08:00
generatedunixname89002005325676	671ee71ad4	[AutoAccept][Codemod][FBSourceClangFormatLinter] Daily `arc lint --take CLANGFORMAT` Reviewed By: zertosh Differential Revision: D25158667 fbshipit-source-id: 3b2a7facbfbfaaabc2cb5ac22906673b17fd0f15	2020-11-23 05:03:53 -08:00
Elias Ellison	a00ba63023	Disable old fuser internally (#48322 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48322 Disable old fuser internally. I would like to find where we are inadvertently setting the old fuser, but in the meantime I would like to land a diff that I know will 100% cause it not to be run, and verify that it fixes the issue. Test Plan: sandcastle Reviewed By: ZolotukhinM Differential Revision: D25126202 fbshipit-source-id: 5a4d0742f5f829e536f50e7ede1256c94dd05232	2020-11-21 00:42:23 -08:00
jiej	4360486346	pass strict_fuser_check for recursive fusion (#47221 ) Summary: We forgot to pass `strict_fuser_check` recursively to nested GraphFuser. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47221 Reviewed By: zhangguanheng66 Differential Revision: D25060095 Pulled By: Krovatkin fbshipit-source-id: 31fe79c3bc080b637fce9aacc562d60708223321	2020-11-18 16:57:38 -08:00
Nikolay Korovaiko	993628c74a	Build shape expressions and remove outputs that are only used by `aten::size`s (#45080 ) Summary: Currently, TE materializes all intermediate results even if they are only used for computing their shapes. This diff ports the approach the OF (Old Fuser) took to deal with this issue. Namely, given the structure of a fusion group we infer all the sizes outside a fusion group based on fusion group's inputs. A simple example would be: ``` def test_fuse(a, b): c = a + b d = c + b return d ``` Here we don't need to cache `c` as computing a gradient for `b` in `d = c + b` doesn't need it. We do need to compute sizes for all arguments here in case broadcasts happen. Without this optimization, TE would need to materialize `c` so we can get its size ``` [DUMP profiling_graph_executor_impl.cpp:499] Optimized Graph: [DUMP profiling_graph_executor_impl.cpp:499] graph(%a.1 : Tensor, [DUMP profiling_graph_executor_impl.cpp:499] %b.1 : Tensor): [DUMP profiling_graph_executor_impl.cpp:499] %11 : Tensor = prim::DifferentiableGraph_0(%b.1, %a.1) [DUMP profiling_graph_executor_impl.cpp:499] return (%11) [DUMP profiling_graph_executor_impl.cpp:499] with prim::DifferentiableGraph_0 = graph(%11 : Tensor, [DUMP profiling_graph_executor_impl.cpp:499] %13 : Tensor): [DUMP profiling_graph_executor_impl.cpp:499] %59 : int[] = aten::size(%13) # <string>:3:44 [DUMP profiling_graph_executor_impl.cpp:499] %62 : int[] = aten::size(%11) # <string>:3:93 [DUMP profiling_graph_executor_impl.cpp:499] %83 : Double(1:1, requires_grad=0, device=cuda:0), %84 : Double(1:1, requires_grad=0, device=cuda:0), %85 : bool = prim::TypeCheck(%11, %13) [DUMP profiling_graph_executor_impl.cpp:499] %86 : Tensor, %87 : Tensor = prim::If(%85) [DUMP profiling_graph_executor_impl.cpp:499] block0(): [DUMP profiling_graph_executor_impl.cpp:499] %d.4 : Double(1:1, requires_grad=0, device=cuda:0), %c.4 : Double(1:1, requires_grad=0, device=cuda:0) = prim::TensorExprGroup_0(%83, %84) [DUMP profiling_graph_executor_impl.cpp:499] -> (%d.4, %c.4) [DUMP profiling_graph_executor_impl.cpp:499] block1(): [DUMP profiling_graph_executor_impl.cpp:499] %94 : Function = prim::Constant[name="fallback_function", fallback=1]() [DUMP profiling_graph_executor_impl.cpp:499] %95 : (Tensor, Tensor) = prim::CallFunction(%94, %11, %13) [DUMP profiling_graph_executor_impl.cpp:499] %96 : Tensor, %97 : Tensor = prim::TupleUnpack(%95) [DUMP profiling_graph_executor_impl.cpp:499] -> (%96, %97) [DUMP profiling_graph_executor_impl.cpp:499] %60 : int[] = aten::size(%87) # <string>:3:55 [DUMP profiling_graph_executor_impl.cpp:499] %61 : int[]? = aten::_size_if_not_equal(%59, %60) # <string>:3:19 [DUMP profiling_graph_executor_impl.cpp:499] %64 : int[]? = aten::_size_if_not_equal(%62, %60) # <string>:3:68 [DUMP profiling_graph_executor_impl.cpp:499] %67 : int[] = aten::size(%86) # <string>:3:55 [DUMP profiling_graph_executor_impl.cpp:499] %68 : int[]? = aten::_size_if_not_equal(%60, %67) # <string>:3:19 [DUMP profiling_graph_executor_impl.cpp:499] %71 : int[]? = aten::_size_if_not_equal(%62, %67) # <string>:3:68 [DUMP profiling_graph_executor_impl.cpp:499] return (%86, %61, %64, %68, %71) [DUMP profiling_graph_executor_impl.cpp:499] with prim::TensorExprGroup_0 = graph(%1 : Double(1:1, requires_grad=0, device=cuda:0), [DUMP profiling_graph_executor_impl.cpp:499] %4 : Double(1:1, requires_grad=0, device=cuda:0)): [DUMP profiling_graph_executor_impl.cpp:499] %5 : int = prim::Constant[value=1]() [DUMP profiling_graph_executor_impl.cpp:499] %c.3 : Double(1:1, requires_grad=0, device=cuda:0) = aten::add(%4, %1, %5) # /scratch/villedepommes/pytorches/bench/test/test_jit.py:2872:16 [DUMP profiling_graph_executor_impl.cpp:499] %2 : int = prim::Constant[value=1]() [DUMP profiling_graph_executor_impl.cpp:499] %d.3 : Double(1:1, requires_grad=0, device=cuda:0) = aten::add(%c.3, %1, %2) # /scratch/villedepommes/pytorches/bench/test/test_jit.py:2873:16 [DUMP profiling_graph_executor_impl.cpp:499] return (%d.3, %c.3) ``` With this optimization we use `prim::BroadcastSizes` to compute the size of `c`. No need to materialize it. ``` [DUMP profiling_graph_executor_impl.cpp:499] Optimized Graph: [DUMP profiling_graph_executor_impl.cpp:499] graph(%a.1 : Tensor, [DUMP profiling_graph_executor_impl.cpp:499] %b.1 : Tensor): [DUMP profiling_graph_executor_impl.cpp:499] %11 : Tensor = prim::DifferentiableGraph_0(%b.1, %a.1) [DUMP profiling_graph_executor_impl.cpp:499] return (%11) [DUMP profiling_graph_executor_impl.cpp:499] with prim::DifferentiableGraph_0 = graph(%11 : Tensor, [DUMP profiling_graph_executor_impl.cpp:499] %13 : Tensor): [DUMP profiling_graph_executor_impl.cpp:499] %59 : int[] = aten::size(%13) # <string>:3:44 [DUMP profiling_graph_executor_impl.cpp:499] %62 : int[] = aten::size(%11) # <string>:3:93 [DUMP profiling_graph_executor_impl.cpp:499] %88 : Double(1:1, requires_grad=0, device=cuda:0), %89 : Double(1:1, requires_grad=0, device=cuda:0), %90 : bool = prim::TypeCheck(%11, %13) [DUMP profiling_graph_executor_impl.cpp:499] %91 : Tensor = prim::If(%90) [DUMP profiling_graph_executor_impl.cpp:499] block0(): [DUMP profiling_graph_executor_impl.cpp:499] %d.4 : Double(1:1, requires_grad=0, device=cuda:0) = prim::TensorExprGroup_0(%88, %89) [DUMP profiling_graph_executor_impl.cpp:499] -> (%d.4) [DUMP profiling_graph_executor_impl.cpp:499] block1(): [DUMP profiling_graph_executor_impl.cpp:499] %97 : Function = prim::Constant[name="fallback_function", fallback=1]() [DUMP profiling_graph_executor_impl.cpp:499] %98 : (Tensor) = prim::CallFunction(%97, %11, %13) [DUMP profiling_graph_executor_impl.cpp:499] %99 : Tensor = prim::TupleUnpack(%98) [DUMP profiling_graph_executor_impl.cpp:499] -> (%99) [DUMP profiling_graph_executor_impl.cpp:499] %85 : int[] = aten::size(%91) [DUMP profiling_graph_executor_impl.cpp:499] %86 : int[] = prim::BroadcastSizes(%59, %62) [DUMP profiling_graph_executor_impl.cpp:499] %61 : int[]? = aten::_size_if_not_equal(%59, %86) # <string>:3:19 [DUMP profiling_graph_executor_impl.cpp:499] %64 : int[]? = aten::_size_if_not_equal(%62, %86) # <string>:3:68 [DUMP profiling_graph_executor_impl.cpp:499] %68 : int[]? = aten::_size_if_not_equal(%86, %85) # <string>:3:19 [DUMP profiling_graph_executor_impl.cpp:499] %71 : int[]? = aten::_size_if_not_equal(%62, %85) # <string>:3:68 [DUMP profiling_graph_executor_impl.cpp:499] return (%91, %61, %64, %68, %71) [DUMP profiling_graph_executor_impl.cpp:499] with prim::TensorExprGroup_0 = graph(%1 : Double(1:1, requires_grad=0, device=cuda:0), [DUMP profiling_graph_executor_impl.cpp:499] %4 : Double(1:1, requires_grad=0, device=cuda:0)): [DUMP profiling_graph_executor_impl.cpp:499] %5 : int = prim::Constant[value=1]() [DUMP profiling_graph_executor_impl.cpp:499] %c.3 : Double(1:1, requires_grad=0, device=cuda:0) = aten::add(%4, %1, %5) # /scratch/villedepommes/pytorches/bench/test/test_jit.py:2872:16 [DUMP profiling_graph_executor_impl.cpp:499] %2 : int = prim::Constant[value=1]() [DUMP profiling_graph_executor_impl.cpp:499] %d.3 : Double(1:1, requires_grad=0, device=cuda:0) = aten::add(%c.3, %1, %2) # /scratch/villedepommes/pytorches/bench/test/test_jit.py:2873:16 [DUMP profiling_graph_executor_impl.cpp:499] return (%d.3) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/45080 Reviewed By: bertmaher Differential Revision: D23856410 Pulled By: Krovatkin fbshipit-source-id: 2956286eb03a4894a5baa151c35e6092466322b1	2020-09-28 10:45:56 -07:00
Will Constable	94e8676a70	Initialize uninitialized variable (#42419 ) Summary: Fixes internal T70924595 Pull Request resolved: https://github.com/pytorch/pytorch/pull/42419 Reviewed By: allwu, Krovatkin Differential Revision: D22889325 Pulled By: wconstab fbshipit-source-id: 108b6a6c6bb7c98d77e22bae9974a6c00bc296f0	2020-08-04 11:35:54 -07:00
Yanan Cao	bdcf320bed	Support custom exception message (#41907 ) Summary: Raise and assert used to have a hard-coded error message "Exception". User provided error message was ignored. This PR adds support to represent user's error message in TorchScript. This breaks backward compatibility because now we actually need to script the user's error message, which can potentially contain unscriptable expressions. Such programs can break when scripting, but saved models can still continue to work. Increased an op count in test_mobile_optimizer.py because now we need aten::format to form the actual exception message. This is built upon an WIP PR: https://github.com/pytorch/pytorch/pull/34112 by driazati Pull Request resolved: https://github.com/pytorch/pytorch/pull/41907 Reviewed By: ngimel Differential Revision: D22778301 Pulled By: gmagogsfm fbshipit-source-id: 2b94f0db4ae9fe70c4cd03f4048e519ea96323ad	2020-08-01 13:03:45 -07:00
Will Constable	4c7fb8c2b6	make FusionCallback refer to specified GraphFuser context (#41560 ) Summary: Fixes issue where - top level fuser's block_ was captured by callback due to [&] capture, - recursive/nested fusers would compare erroneously to top-level block_ instead of own block_ Closes (https://github.com/pytorch/pytorch/issues/39810) Pull Request resolved: https://github.com/pytorch/pytorch/pull/41560 Reviewed By: Krovatkin Differential Revision: D22583196 Pulled By: wconstab fbshipit-source-id: 8f543cd9ea00e116cf3e776ab168cdd9fed69632	2020-07-28 15:01:24 -07:00
Xiaomeng Yang	80d5b3785b	Add torch.logit function (#41062 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41062 Add torch.logit function Test Plan: buck test mode/dev-nosan //caffe2/test:torch -- "logit" Reviewed By: hl475 Differential Revision: D22406912 fbshipit-source-id: b303374f4c68850eb7477eb0645546a24b844606	2020-07-13 19:33:20 -07:00
Michael Suo	9e32a1f5cd	[wip] update graph fuser aliasdb in-place (#37106 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37106 Recomputing the aliasdb on every fusion iteration + in every subblock is hugely expensive. Instead, update it in-place when doing fusion. The graph fuser pass operates by pushing nodes into a fusion group. So we start with ``` x, y = f(a, b, c) ``` and end with: ``` x_out, y_out = prim::fusionGroup(a, b, c) x_in, y_in = f(a_in, b_in, c_in) -> x_in, y_in ``` We destroy the `x` and `y` `Value*`s in the process. This operation is easy to express as an update to the aliasDb--`x_out` just takes on all the aliasing information `x` used to have. In particular, since we know `f` and `prim::fusionGroup` are purely functional, we don't have to mess with any write information. This PR is the bare minimum to get this working, in the interest of unscrewing the compilation times ASAP. Followups I want to do: - We don't have a way of expressing deletion of values in AliasDb. In `graph_fuser.cpp` we sometimes construct nodes that we end up throwing away, and we are littering `MemoryDAG` with references to dangling pointers. Because of the way the pass works, it's fine, but this is fragile so I want to fix it. - We should decouple alias analysis from write tracking, to simplify the job of keeping the write caches consistent as we mutate the aliasing information. - the tensorexpr fuser doesn't do this and thus is incorrect today, we need to update it to work. Test Plan: Imported from OSS Differential Revision: D21219179 Pulled By: suo fbshipit-source-id: 8ae5397b3a0ad90edec2fbc555647091f1ad5284	2020-04-30 22:21:35 -07:00
Meghan Lele	6384c2d81b	[JIT] clang-format JIT code (#35115 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35115 This commit runs the newly added tools/clang_format.py on the JIT codebase and includes all of the formatting changes thus produced. Testing: Ran the script, CI. Test Plan: Imported from OSS Reviewed By: eellison Differential Revision: D20568523 Pulled By: SplitInfinity fbshipit-source-id: e09bdb982ccf090eecfb7c7b461b8d0681eef82b	2020-03-26 11:24:51 -07:00
Basil Hosmer	ad769d74d9	Collapse _like overloads into a single overload. (#33705 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33705 The fact that there were two overloads appears to be a historical artifact that dates back to when goldsborough originally added these bindings in the first place. If TensorOptions is made optional, then you only need one overload, not two, as they are exactly redundant with each other. When MemoryFormat was added, it was made a little harder to do this, as the C++ syntax at::empty_like(t, memory_format) would not work if you collapsed the overload; but now it works because TensorOptions supports MemoryFormat. The upshot is, I can get rid of all the overloads and just have one overload. Amazingly, this change is backwards compatible, as the test attests. While I was at it, I also deleted the overload name from the functions entirely. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D20073355 Pulled By: bhosmer fbshipit-source-id: c6a8908213b32ccf6737ea864d135e2cce34f56b	2020-03-01 19:40:22 -08:00
Michael Suo	dbe850af5b	[jit] do the code reorg (#33851 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33851 Rationale and context described in #33828. Script to reproduce the move: https://gist.github.com/suo/16cbefaaeb67ca5a7c6caffd49b7f6e9 ghstack-source-id: 99079645 Test Plan: Make sure CI passes Reviewed By: jamesr66a Differential Revision: D20133869 fbshipit-source-id: 390e9241a9c85366d9005c492ac31f10aa96488e	2020-02-27 13:02:51 -08:00
Nikolay Korovaiko	a943b0518b	strict check for a device type in Fuser (#33025 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33025 Differential Revision: D19975873 Pulled By: Krovatkin fbshipit-source-id: 57f160bec9e4285dda63611f12665264754aac32	2020-02-20 23:53:27 -08:00
Mikhail Zolotukhin	806e7daa1f	Rename TorchScript compiler to IR emitter to better reflect its function. (#33127 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33127 Test Plan: Imported from OSS Differential Revision: D19806503 Pulled By: ZolotukhinM fbshipit-source-id: ab78bdbbac5f12dbcc6c2e2573f5862a16ffcf3d	2020-02-12 18:45:13 -08:00
Zachary DeVito	72a00a8a9c	Remove Node dependencies from operator.h (#32682 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32682 This moves code around so that operator.h/cpp no longer requires a full definition of Node* nor does it include alias analysis or the pretty printer. This should make it possible to include in the mobile build. Functionality for checking if operators match Node and to look up and operator for a Node have moved to the Node object. Test Plan: Imported from OSS Differential Revision: D19615386 Pulled By: zdevito fbshipit-source-id: e38bdf29971183597ef940d061c06ba56e71d9c5	2020-02-12 14:47:26 -08:00
comet	9a2691f2fc	Fix spelling errors Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32673 Differential Revision: D19597118 Pulled By: pietern fbshipit-source-id: f88c1da7548fcee141ed248f5f49d25c1d639955	2020-01-28 04:46:15 -08:00
Vitaly Fedyunin	4bfe2f0900	Fix jit outplace tracing and reapply changes to *_like operators. (#28839 ) Summary: Reapply reverted and fix files `gen_variable_type.py` `test_jit.py` https://github.com/pytorch/pytorch/issues/27891 Cleanup testing of _like operators https://github.com/pytorch/pytorch/issues/27890 Add memory format support to randn_like operator https://github.com/pytorch/pytorch/issues/27889 Add memory format support to randint_like operator https://github.com/pytorch/pytorch/issues/27562 Add memory format support to zeros_like operator https://github.com/pytorch/pytorch/issues/27561 Add memory format support to rand_like operator https://github.com/pytorch/pytorch/issues/27270 Add memory format support to ones_like operator https://github.com/pytorch/pytorch/issues/27262 Add memory format support to full_like operator Pull Request resolved: https://github.com/pytorch/pytorch/pull/28839 Test Plan: Imported from GitHub, without a `Test Plan:` line. buck test mode/dev //language_technology/neural_mt/os/pytorch_translate/test:test_onnx -- 'test_forced_decoder_export_vocab_reduction $language_technology\.neural_mt\.os\.pytorch_translate\.test\.test_onnx\.TestONNX$' Differential Revision: D18203397 Pulled By: VitalyFedyunin fbshipit-source-id: eea41cbd4c232cf5a54172b1e1b16b173798f298	2019-10-31 13:23:08 -07:00
Vitaly Fedyunin	266c1652e6	Back out "Add memory format support to `rand_like` operator" (#28801 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28801 Original commit changeset: 2a1d47571268 ghstack-source-id: 92748792 Test Plan: buck test language_technology/neural_mt/os/pytorch_translate/test:test_onnx Reviewed By: ifedan Differential Revision: D18175304 fbshipit-source-id: ffd61f6e42f256b39b80a6b42d989c238228f25d	2019-10-28 12:44:45 -07:00
Vitaly Fedyunin	04f5325583	Add memory format support to `rand_like` operator (#27561 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27561 Adds memory_format keyword argument (positional for cpp). 'Preserve' behavior now follows next rules: 1) If tensor is non-overlapping and dense - output tensor will have the same strides as input tensor. 2) If not (1) and tensor is stored in the channels last format, output tensor going to have channels last format. 3) Output tensor is going to be contiguous in all other cases. --- Dense tensor is the tensor that store values in a contiguous block of memory. Non-overlapping tensor is the tensor in which elements occupy individual non-repetitive memory. Test Plan: Imported from OSS Differential Revision: D17980316 Pulled By: VitalyFedyunin fbshipit-source-id: 2a1d47571268673de0c6f5ae1b6d4f9110962ab0	2019-10-25 07:29:12 -07:00
Mike Ruberry	ac7996ccd3	Removes SymbolicVariable (#25077 ) Summary: This PR excises the last of SymbolicVariable. There should be no change in functionality. One new test for addmm fusion was added. A case where the peephole optimizer might convert a scalar argument remains untested. Pull Request resolved: https://github.com/pytorch/pytorch/pull/25077 Test Plan: Refactors existing code so mostly covered by current tests. One test for addmm fusion was added. Differential Revision: D17145334 Pulled By: mruberry fbshipit-source-id: 6b68faf764f9ee8398b55c43110228ed9faf81eb	2019-08-31 11:19:50 -07:00
Zachary DeVito	bdc57d3833	Merge ProfiledTensorType and TensorType (#24284 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24284 This PR finishes the unification of all Tensor types into a single object. ProfiledTensorType is renamed to TensorType and the old TensorType is deleted. Notes: * Fixes bug in merge for VaryingShape by changing its representation to an optional list of optional ints. * Removes ProfiledTensorType::create(type) invocations that can now simply be expect calls on tensor type. Test Plan: Imported from OSS Differential Revision: D16794034 Pulled By: zdevito fbshipit-source-id: 10362398d0bb166d0d385d74801e95d9b87d9dfc	2019-08-20 13:01:28 -07:00
Nikolay Korovaiko	3d15ee1b34	Remove more uses of `DimensionedTensorType` Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23060 Differential Revision: D16460391 Pulled By: Krovatkin fbshipit-source-id: b50ee87d22ad18b8cbfff719b199ea876ef172f1	2019-08-01 21:19:28 -07:00
Thomas Viehmann	cf50249bde	Disable fusion of grad_sum_to_size (#23372 ) Summary: Fixes: https://github.com/pytorch/pytorch/issues/22833 grad_sum_to_size does not commute with AutogradAdd after all because it turns the broadcasting AutogradAdd into a broadcasting add. Chillee did actually do most of the tracking down to the fusion of grad_sum_to_size and pinging me when he had found the cause. Thank you! About the choice of removing the fusion completely instead of being more precise: - We do have grad_sum_to_size elimination which works for cases where broadcasting does not actually happen in the forward, so the cases where the fusing of grad_sum_to_size is actually beneficial is much smaller than when initially proposed. - There will be less fusion, in terms of the tests, IOU stops being fully fused. I vaguely think that it is a case we could handle with refined logic. - Keeping it would add complexity in checking when to merge fusion groups to the complexities that this PR removes. - The future of fusion probably lies more in more complete solutions including reductions (TVM or KeOps or our own or ...). Pull Request resolved: https://github.com/pytorch/pytorch/pull/23372 Differential Revision: D16489930 Pulled By: soumith fbshipit-source-id: bc0431b0d3eda264c401b634675872c4ce46f0f4	2019-07-25 08:55:33 -07:00
jjsjann123	252710262f	(#22775 ) Summary: passing FusionCallback and Symbol to recursive GraphFuser calls. It ensures consistent fusion in nested Blocks. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22775 Differential Revision: D16439979 Pulled By: soumith fbshipit-source-id: 18d4b13f52b03708b8580c73f75450adbb672ac1	2019-07-25 05:54:03 -07:00
Bram Wasti	05d56bd1b6	Remove hard-coded NVRTC specific constant from fuser header Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22699 Test Plan: Imported from OSS Differential Revision: D16192290 Pulled By: bwasti fbshipit-source-id: 4dccaf3e6e0151e86d35474c36e1ddb7f2afb5cf	2019-07-11 13:44:25 -07:00
Thomas Viehmann	17941f9979	JIT: Eliminate SumToSize by using Optional Lists (#18697 ) Summary: This PR is a eliminates unneeded grad_sum_to_size and in particular speeds up the LSTM backward by allowing better fusion. It consists of two parts: - In AutoDiff, record broadcasting sizes only if the broadcast output size is different from the input size, otherwise record None. - The specialization of Optional arguments (#18407) allows us to then eliminate ` _grad_sum_to_size(t, None)` in the peephole optimization step. Thus, in the LSTM case, no SumToSize remain in the crucial fusion group. The trick here is that we can specialize on the runtime information from the forward. I'm testing that different broadcasting situations lead to different graphs. I didn't move all symbolic_script _grad_sum_to_size to the new logic, but it might be better to do this incrementally, anyway. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18697 Differential Revision: D15482076 Pulled By: wanchaol fbshipit-source-id: 7f89367e35b8729910077c95c02bccefc8678afb	2019-05-24 11:24:17 -07:00
Wanchao Liang	871c9dcb1d	move batchnorm and layernorm fusion to decompose (#20337 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20337 ghimport-source-id: 2196f84f2ef384c1f25587b2fb4bd9dd2f63c2b4 Differential Revision: D15448596 Pulled By: wanchaol fbshipit-source-id: b66e608f1b72471fc0775aaa4e09f9fa1070fc3c	2019-05-22 18:01:27 -07:00
Bram Wasti	7b733e4fc1	Rebase conflict fix for isFusableDevice (#20251 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20251 ghimport-source-id: 0c8c1847a7979fcd77e4f6618730b170b6b8ce25 Differential Revision: D15262850 Pulled By: bwasti fbshipit-source-id: 17ecc340a310ddbcce141cfa3ee0efa9660194d2	2019-05-08 12:14:12 -07:00
Bram Wasti	4ca325df87	Add Custom graph fusion (#18588 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18588 ghimport-source-id: f40df177af8b87c73f04bf337f478a62133284cf Differential Revision: D14901297 Pulled By: bwasti fbshipit-source-id: 1b6371a5175b3d63dad542b7cc22cb82e8c6cfd0	2019-05-06 23:15:16 -07:00
Mikhail Zolotukhin	8b46938355	Cleanup includes in torch/csrc/jit/* (#19922 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19922 ghimport-source-id: 0434c46bf75621ff79ea27a18a2475e7f13e2487 Differential Revision: D15125015 Pulled By: ZolotukhinM fbshipit-source-id: 5685edfc94067f62e363a85e9badb7f757b1d321	2019-05-06 13:40:26 -07:00
Zachary DeVito	a425e1cbf8	Remove duplicate inlineCallToCode (#19724 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19724 ghimport-source-id: a68d28ac9bbe62dd61f03bfd9d57f4ef1d0ce9c9 Reviewed By: jamesr66a Differential Revision: D15078532 Pulled By: zdevito fbshipit-source-id: bebd34ff6105f538395260b027dc169448b5bc96	2019-04-25 15:53:10 -07:00
Wanchao Liang	c571969148	Fix the insert_guard for norm decomposation (#19646 ) Summary: move the insert_guard all the way up to the beginning of the decomposation, this will fix the case that we lose insert_point context after decomposeCommonNormalization and we still need to modify the graph. fixes #19502 Pull Request resolved: https://github.com/pytorch/pytorch/pull/19646 Differential Revision: D15058040 Pulled By: wanchaol fbshipit-source-id: ebdbf8623ebfe4556c461e1b650e94b905791adb	2019-04-24 23:12:37 -07:00
James Reed	e7fc7c732c	Bugfix for fusion device check (#19594 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19594 I missed a callsite Reviewed By: wanchaol Differential Revision: D15041457 fbshipit-source-id: eef76ad51bee06a56d31b4ab64f19250fe2ad8f0	2019-04-22 20:55:17 -07:00
James Reed	5be4bee4ff	Don't create FusionGroups for known-CPU producer values (#19342 ) Summary: I believe the existing check in FuseGraph was only `false` if PyTorch was built with NO_CUDA=1. Otherwise, we would create fusion groups even if we're on a CPU-only machine running CPU code. This is confusing. Instead I've made it so that the decision to fuse or not is dependent on if the producer Value is a known CPU tensor. If it is, we skip fusion. Pull Request resolved: https://github.com/pytorch/pytorch/pull/19342 Differential Revision: D15038351 Pulled By: jamesr66a fbshipit-source-id: fce9d83929309a7bf14346833f84b996f3e7f6db	2019-04-22 16:57:18 -07:00
Michael Suo	1e94a3bc4d	Turn resolver into a class (#19236 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19236 ghimport-source-id: d36705ea5ecff085d0d84ea57bb96d18d7c260dd Differential Revision: D14928292 Reviewed By: zdevito Pulled By: suo fbshipit-source-id: cd038100ac423fa1c19d0547b9e5487a633a2258	2019-04-19 13:01:59 -07:00

1 2 3

150 Commits