pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Raghavan Raman	dd7bbe1a63	[NNC] Make splitWithMask transform in-place (#58269 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58269 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D28427227 Pulled By: navahgar fbshipit-source-id: 4e38a436abcf4752fd7ef6ab3666876eec6ea5ba	2021-05-25 11:32:51 -07:00
CodemodService FBSourceClangFormatLinterBot	cbfce376a8	[AutoAccept][Codemod][FBSourceClangFormatLinter] Daily `arc lint --take CLANGFORMAT` Reviewed By: zertosh Differential Revision: D28319469 fbshipit-source-id: 8295597a8ee16b2fef3f7aacdd6c892cb22db988	2021-05-10 03:39:31 -07:00
Nikita Shulga	3a66a1cb99	[clang-tidy] Exclude cppcoreguidelines-avoid-magic-numbers (#57841 ) Summary: Add cppcoreguidelines-avoid-magic-numbers exclusion to clang-tidy Remove existing nolint warnings using following script: ``` for file in `git ls-files \| grep -v \.py`; do gsed '/^ *\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers)/d' -i $file; done ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/57841 Reviewed By: samestep Differential Revision: D28295045 Pulled By: malfet fbshipit-source-id: 7c6e8d1213c9593f169ed3df6a916498f1a97163	2021-05-07 20:02:33 -07:00
Nikita Shulga	4cb534f92e	Make PyTorch code-base clang-tidy compliant (#56892 ) Summary: This is an automatic change generated by the following script: ``` #!/usr/bin/env python3 from subprocess import check_output, check_call import os def get_compiled_files_list(): import json with open("build/compile_commands.json") as f: data = json.load(f) files = [os.path.relpath(node['file']) for node in data] for idx, fname in enumerate(files): if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'): files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')] return files def run_clang_tidy(fname): check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"]) changes = check_output(["git", "ls-files", "-m"]) if len(changes) == 0: return check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"]) def main(): git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n") compiled_files = get_compiled_files_list() for idx, fname in enumerate(git_files): if fname not in compiled_files: continue if fname.startswith("caffe2/contrib/aten/"): continue print(f"[{idx}/{len(git_files)}] Processing {fname}") run_clang_tidy(fname) if __name__ == "__main__": main() ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892 Reviewed By: H-Huang Differential Revision: D27991944 Pulled By: malfet fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179	2021-04-28 14:10:25 -07:00
Mikhail Zolotukhin	7ab654afd7	[TensorExpr] Rename `Tensor::call` to `Tensor::load` to be consistent with `Buf` and `Placeholder`. (#55826 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55826 It's a mechanical change. Differential Revision: D27717777 Test Plan: Imported from OSS Reviewed By: navahgar Pulled By: ZolotukhinM fbshipit-source-id: fbc1bb99602250c706cf2c8c2684119c323e4d51	2021-04-13 12:08:53 -07:00
Mikhail Zolotukhin	1263448cb2	[TensorExpr] Remove mask field from Load and Store classes. (#55825 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55825 The mask has never been used (in vectorization we generate an explicit `IfThenElse` construct when we need to mask out some elements). The PR removes it and cleans up all its traces from tests. Differential Revision: D27717776 Test Plan: Imported from OSS Reviewed By: navahgar Pulled By: ZolotukhinM fbshipit-source-id: 41d1feeea4322da75b3999d661801c2a7f82b9db	2021-04-13 12:08:51 -07:00
Joel Schlosser	defc649eca	Update to short forms of splitWithTail / splitWithMask (#55542 ) Summary: Switched to short forms of `splitWithTail` / `splitWithMask` for all tests in `test/cpp/tensorexpr/test_*.cpp` (except test_loopnest.cpp) Pull Request resolved: https://github.com/pytorch/pytorch/pull/55542 Reviewed By: mrshenli Differential Revision: D27632033 Pulled By: jbschlosser fbshipit-source-id: dc2ba134f99bff8951ae61e564cd1daea92c41df	2021-04-09 10:15:20 -07:00
Mikhail Zolotukhin	688e350725	[TensorExpr] Nuke DepTracker and findAllNeededTensors. (#54997 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54997 DepTracker was used to automatically pull in dependent computations from output ones. While it seems quite convenient, it's led to several architectural issues, which are fixed in this stack. DepTracker worked on Tensors, which is a pair of Buf and Stmt. However, Stmt could become stale and there was no way to reliably update the corresponding tensor. We're now using Bufs and Stmts directly and moving away from using Tensors to avoid these problems. Removing DepTracker allowed to unify Loads and FunctionCalls, which essentially were duplicates of each other. Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D27446414 Pulled By: ZolotukhinM fbshipit-source-id: a2a32749d5b28beed92a601da33d126c0a2cf399	2021-04-01 19:46:26 -07:00
Andres Suarez	8530c65e25	[codemod][fbcode/caffe2] Apply clang-format update fixes Test Plan: Sandcastle and visual inspection. Reviewed By: igorsugak Differential Revision: D25849205 fbshipit-source-id: ef664c1ad4b3ee92d5c020a5511b4ef9837a09a0	2021-01-09 14:37:36 -08:00
Nick Gibson	db2e9c1e7f	[NNC] Intermediate allocs flattened and dependency support (#49554 ) Summary: Makes two changes in NNC for intermediate buffer allocations: 1. Flattens dimensions of buffers allocated in LoopNest::prepareForCodegen() to match their flattened usages. 2. Adds support for tracking memory dependencies of Alloc/Free to the MemDependencyChecker, which will allow us to check safety of accesses to intermediate buffers (coming in a future diff). I didn't add any new tests as the mem dependency checker tests already cover it pretty well, particularly the GEMM test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49554 Reviewed By: VitalyFedyunin Differential Revision: D25643133 Pulled By: nickgg fbshipit-source-id: 66be3054eb36f0a4279d0c36562e63aa2dae371c	2020-12-21 10:35:15 -08:00
Bert Maher	07657b6001	[tensorexpr] Switch cpp tests to pure gtest (#48160 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48160 We no longer use the custom c++ test infra anyways, so move to pure gtest. Fixes #45703 ghstack-source-id: 116977283 Test Plan: `buck test //caffe2/test/cpp/tensorexpr` Reviewed By: navahgar, nickgg Differential Revision: D25046618 fbshipit-source-id: da34183d87465f410379048148c28e1623618553	2020-11-18 12:23:34 -08:00
Nick Gibson	eab809377d	[NNC] Remove all deferred expansion from Reductions (#47709 ) Summary: Refactors the ReduceOp node to remove the last remaining deferred functionality: completing the interaction between the accumulator buffer and the body. This fixes two issues with reductions: 1. Nodes inside the interaction could not be visited or modified, meaning we could generate bad code when the interaction was complex. 2. The accumulator load was created at expansion time and so could not be modified in some ways (ie. vectorization couldn't act on these loads). This simplifies reduction logic quite a bit, but theres a bit more involved in the rfactor transform. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47709 Reviewed By: ZolotukhinM Differential Revision: D24904220 Pulled By: nickgg fbshipit-source-id: 159e5fd967d2d1f8697cfa96ce1bb5fc44920a40	2020-11-12 20:17:52 -08:00
Nick Gibson	76ff557de7	[NNC] add hazard analysis to Bounds Inference (#47684 ) Summary: Adds a helper function to Bounds Inference / Memory Analaysis infrastructure which returns the kind of hazard found between two Stmts (e.g. Blocks or Loops). E.g. ``` for (int i = 0; i < 10; ++i) { A[x] = i * 2; } for (int j = 0; j < 10; ++j) { B[x] = A[x] / 2; } ``` The two loops have a `ReadAfterWrite` hazard, while in this example: ``` for (int i = 0; i < 10; ++i) { A[x] = i * 2; } for (int j = 0; j < 10; ++j) { A[x] = B[x] / 2; } ``` The loops have a `WriteAfterWrite` hazard. This isn't 100% of what we need for loop fusion, for example we don't check the strides of the loop to see if they match. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47684 Reviewed By: malfet Differential Revision: D24873587 Pulled By: nickgg fbshipit-source-id: 991149e5942e769612298ada855687469a219d62	2020-11-12 11:34:31 -08:00
Nick Gibson	0edc6a39c8	[NNC] Read/Write Dependency analysis (#46952 ) Summary: Adds a new piece of infrastructure to the NNC fused-kernel generation compiler, which builds a dependency graph of the reads and writes to memory regions in a kernel. It can be used to generate graphs like this from the GEMM benchmark (not this only represents memory hierarchy not compute hierarchy): ![image](https://user-images.githubusercontent.com/701287/97368797-e99d5600-1868-11eb-9a7e-ceeb91ce72b8.png) Or to answer questions like this: ``` Tensor* c = Compute(...); Tensor* d = Compute(...); LoopNest loop({d}); MemDependencyChecker analyzer; loop.root_stmt()->accept(analyzer); if (analyzer.dependsDirectly(loop.getLoopStmtsFor(d)[0], loop.getLoopStmtsFor(c)[0]) { // do something, maybe computeInline } ``` Or this: ``` Tensor* d = Compute(...); LoopNest loop({d}); MemDependencyChecker analyzer(loop.getInputs(), loop.getOutputs()); const Buf* output = d->buf(); for (const Buf* input : inputs) { if (!analyzer.dependsIndirectly(output, input)) { // signal that this input is unused } } ``` This is a monster of a diff, and I apologize. I've tested it as well as possible for now, but it's not hooked up to anything yet so should not affect any current usages of the NNC fuser. How it works: Similar to the registerizer, the MemDependencyChecker walks the IR aggregating memory accesses into scopes, then merges those scopes into their parent scope and tracks which writes are responsible for the last write to a particular region of memory, adding dependency links where that region is used. This relies on a bunch of math on symbolic contiguous regions which I've pulled out into its own file (bounds_overlap.h/cpp). Sometimes this wont be able to infer dependence with 100% accuracy but I think it should always be conservative and occaisionally add false positives but I'm aware of no false negatives. The hardest part of the analysis is determining when a Load inside a For loop depends on a Store that is lower in the IR from a previous iteration of the loop. This depends on a whole bunch of factors, including whether or not we should consider loop iteration order. The analyzer comes with configuration of this setting. For example this loop: ``` for (int i = 0; i < 10; ++i) { A[x] = B[x] + 1; } ``` has no inter loop dependence, since each iteration uses a distinct slice of both A and B. But this one: ``` for (int i = 0; i < 10; ++i) { A[0] = A[0] + B[x]; } ``` Has a self loop dependence between the Load and the Store of A. This applies to many cases that are not reductions as well. In this example: ``` for (int i =0; i < 10; ++i) { A[x] = A[x+1] + x; } ``` Whether or not it has self-loop dependence depends on if we are assuming the execution order is fixed (or whether this loop could later be parallelized). If the read from `A[x+1]` always comes before the write to that same region then it has no dependence. The analyzer can correctly handle dynamic shapes, but we may need more test coverage of real world usages of dynamic shapes. I unit test some simple and pathological cases, but coverage could be better. Next Steps: Since the PR was already so big I didn't actually hook it up anywhere, but I had planned on rewriting bounds inference based on the dependency graph. Will do that next. There are few gaps in this code which could be filled in later if we need it: * Upgrading the bound math to work with write strides, which will reduce false positive dependencies. * Better handling of Conditions, reducing false positive dependencies when a range is written in both branches of a Cond. * Support for AtomicAdd node added in Cuda codegen. Testing: See new unit tests, I've tried to be verbose about what is being tested. I ran the python tests but there shouldn't be any way for this work to affect them yet. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46952 Reviewed By: ejguan Differential Revision: D24730346 Pulled By: nickgg fbshipit-source-id: 654c67c71e9880495afd3ae0efc142e95d5190df	2020-11-04 19:52:20 -08:00

14 Commits