Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22175
- Rename AliasAnalysisKind::DEFAULT to AliasAnalysisKind::CONSERVATIVE
- Introduce AliasAnalysisKind::FROM_SCHEMA, which means the alias annotations in the operator's schema are honored (see the sketch below)
- Introduce AliasAnalysisKind::INTERNAL_SPECIAL_CASE so that assertions can check that internally special-cased ops are treated correctly
- aten:: and prim:: ops are not treated as special cases anymore, but just use AliasAnalysisKind::FROM_SCHEMA
- There's a set of assertions to ensure that aten:: and prim:: ops are all correctly set up to use AliasAnalysisKind::FROM_SCHEMA. Once this PR lands and passes all tests, we will remove those assertions and open up the possibility of different AliasAnalysisKind settings for aten:: and prim:: ops
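As a hedged illustration of what FROM_SCHEMA trusts, here are a few schema strings with alias annotations; the operator names below are invented for this sketch:
```cpp
#include <string>

// Invented schema strings, for illustration only.
// "(a)" puts a value into alias set `a` (the output may alias `self`);
// "(a!)" additionally marks it as written to in place.
// Under AliasAnalysisKind::FROM_SCHEMA the alias analysis trusts exactly these
// annotations; under CONSERVATIVE (the old DEFAULT) it must assume the op may
// read, write, or alias anything.
const std::string pure_op     = "myns::relu_copy(Tensor self) -> Tensor";
const std::string viewing_op  = "myns::narrow_view(Tensor(a) self, int dim, int start, int len) -> Tensor(a)";
const std::string mutating_op = "myns::relu_(Tensor(a!) self) -> Tensor(a!)";
```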
Differential Revision: D15929595
fbshipit-source-id: 7c6a9d4d29e13b8c9a856062cd6fb3f8a46a2e0d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21084
- AliasAnalysisKind can now be set using the torch::RegisterOperators() API (sketched below)
- This also allows us to remove the last place in torch::jit::RegisterOperators that didn't use c10 yet.
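A minimal sketch of what this can look like with the options-based torch::RegisterOperators interface; the op name, kernel, and header choice are assumptions of this sketch, not code from the PR:
```cpp
#include <torch/script.h>  // assumed umbrella header pulling in torch::RegisterOperators

// Hypothetical kernel, for illustration only.
torch::Tensor my_relu(torch::Tensor self) {
  return self.clamp_min(0);
}

// Register the op and ask the JIT's alias analysis to trust the schema's
// alias annotations instead of the conservative default.
static auto registry = torch::RegisterOperators().op(
    "myns::my_relu(Tensor self) -> Tensor",
    torch::RegisterOperators::options()
        .catchAllKernel<decltype(my_relu), &my_relu>()
        .aliasAnalysis(c10::AliasAnalysisKind::FROM_SCHEMA));
```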
Reviewed By: dzhulgakov
Differential Revision: D15542097
fbshipit-source-id: ea127ecf051a5c1e567e035692deed44e04faa9e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19280
We want to use the function schema parser from ATen/core, but with as little dependencies as possible.
This diff moves the function schema parser into its own file and removes some of its dependencies.
Reviewed By: dzhulgakov
Differential Revision: D14931651
fbshipit-source-id: c2d787202795ff034da8cba255b9f007e69b4aea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18037
The FunctionSchema can now store an overload name and the parser knows how to parse it. Specify like this:
```
my_func.overload1(arg1: Tensor) -> Tensor
my_func.overload2(arg1: Tensor, arg2: Tensor) -> Tensor
```
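As a hedged illustration, parsing such a declaration and reading the overload name back might look roughly like this; the header path and parser entry point are assumptions (the parser has moved between directories over time), and the argument is written in the `Type name` order used by the other schema strings in this log:
```cpp
#include <iostream>
#include <torch/csrc/jit/frontend/function_schema_parser.h>  // assumed current location

int main() {
  // The ".overload1" suffix becomes the FunctionSchema's overload name.
  auto schema = torch::jit::parseSchema("my_func.overload1(Tensor arg1) -> Tensor");
  std::cout << schema.name() << " / " << schema.overload_name() << "\n";
  // Expected output, roughly: my_func / overload1
  return 0;
}
```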
Reviewed By: zdevito
Differential Revision: D14467497
fbshipit-source-id: 8832b32f07351bb61090357b17b77a6a2fed3650
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16534
All c10 ops from the c10 dispatcher are now automatically registered with JIT
Reviewed By: dzhulgakov
Differential Revision: D13869275
fbshipit-source-id: 5ab5dec5b983fe661f977f9d29d8036768cdcab6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16462
This file was moved, now we change the includes to the new location and remove the proxy header.
Reviewed By: ezyang
Differential Revision: D13847279
fbshipit-source-id: 4617d52fdcfe785cb7b2154460a6686c437abd8b
Summary:
This improves the error message for "unknown builtin op" to suggest similarly named ops.
Currently it prints all operators whose name is within an edit distance of two of the unknown name.
Related issue: https://github.com/pytorch/pytorch/issues/13409
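The heart of the suggestion logic is an edit-distance filter over the registered operator names; a stand-alone sketch of the idea (not the actual implementation) follows:
```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Classic dynamic-programming Levenshtein distance.
size_t editDistance(const std::string& a, const std::string& b) {
  std::vector<size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (size_t j = 0; j <= b.size(); ++j) prev[j] = j;
  for (size_t i = 1; i <= a.size(); ++i) {
    cur[0] = i;
    for (size_t j = 1; j <= b.size(); ++j) {
      size_t sub = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, sub});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

int main() {
  // Hypothetical registry of known op names.
  std::vector<std::string> known = {"aten::add", "aten::addmm", "aten::abs"};
  std::string unknown = "aten::ad";
  for (const auto& name : known)
    if (editDistance(unknown, name) <= 2)  // "within two edits"
      std::cout << "did you mean: " << name << "?\n";
}
```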
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15183
Differential Revision: D13578509
Pulled By: eellison
fbshipit-source-id: 5c73408eda1f7aa456f5bd28790c34df0c76aeca
Summary:
The PR clang-formats everything in `torch/csrc/jit/` and adds it to the pre-commit hook.
Here is a list of non-mechanical changes:
- I went over each file and fixed things up wherever I could tell that clang-format was clobbering comment formatting.
- Made the macros in register_prim_ops a little more clang-format friendly by omitting trailing commas
- Refactored autodiff.cpp to use a helper class with explicit state rather than a bunch of capturing lambdas
- Small improvements to the pre-commit clang-format hook
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15524
Differential Revision: D13547989
Pulled By: suo
fbshipit-source-id: 3ff1541bb06433ccfe6de6e33f29227a2b5bb493
Summary:
This PR enables autodiff to use the forward/backward graph compiled from Python code, instead of using symbolic gradients (modifying the original graph directly).
For now we put the map in a separate .h file while we wait for the native_functions.yaml and derivatives.yaml merge. This should ideally go into native_functions.yaml eventually.
This PR should be enough to unblock us for now, we can start writing gradients for aten functions in python.
Differential Revision: D13494635
Pulled By: ailzhang
fbshipit-source-id: f8d51a15243ac46afd09d930c573ccdfcd9fdaaf
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root level directory.
I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.
I used the following script to do the canonicalization:
```
import subprocess
import re
import os.path
files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
for fn in files:
    if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
        continue
    if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
        continue
    with open(fn, 'r') as f:
        c = f.read()
    def fmt(p):
        return "#include <{}>".format(p)
    def repl(m):
        p = m.group(1)
        if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
            return fmt(p)
        if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
            return fmt(p)
        for root in ["aten/src", "torch/lib", ""]:
            for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                new_p = os.path.relpath(os.path.join(bad_root, p), root)
                if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                    return fmt(new_p)
        print("ERROR: ", fn, p)
        return m.group(0)
    new_c = re.sub(r'#include "([^"]+)"', repl, c)
    if new_c != c:
        print(fn)
        with open(fn, 'w') as f:
            f.write(new_c)
```
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849
Reviewed By: dzhulgakov
Differential Revision: D13363445
Pulled By: ezyang
fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
Summary:
Get pretty printer ready for use as a serialization format
This PR adds a bunch of functionality to the pretty printer (now called python_printer to reflect
the fact that it will be used to output valid python source). The idea is to get the printer
ready for use as serialization format. This PR does not have tests beyond what the pretty
printer already had. PRs stacked on this one will do round-trip export/import to test this functionality more robustly.
Notes:
* PythonPrinter is an evolution of the original pretty printer. However, much of it has changed so it is best just to
read it as a new implementation. Trying to correlate it to the original implementation is probably not much help.
* The printer tries to get reasonably close to how the original function was likely written, such as
writing expressions rather than making intermediates when possible. We may decide to turn this off
for the actual serialization, but it is useful for pretty printing.
* tensor field access was changed so that prim::device and family have schema
* fixed a bug in the compiler where setUniqueName got called even when a value already had one; this sometimes assigned really poor names to graph inputs
* Graph::insert gains an optional range argument to make range-preserving inserts easier.
* prim:: ops that can have schema now have schema. This is because when we parse them back in,
we will need the schema to correctly set their output types.
* there is code in the python printer to complain if you try to add a prim op and do not update the printer.
* BuiltinModule is generalized to take an operator namespace and a version number for work in future commits.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13616
Reviewed By: goldsborough
Differential Revision: D13008252
Pulled By: zdevito
fbshipit-source-id: 32b33bc6410d6ca1c6f02bd6e050f8d5eea32083
Summary:
This PR principally redesigns the fuser's logical flow to be hierarchical, with device-independent logic directing (relatively little) device-specific logic. This design is based on reviews of XLA, TVM, internal design review at NVIDIA and discussions with fuser owners at Facebook. To further vet the design I have begun developing the next significant PR (extended fusion logic) on top of this architecture and it has made the work significantly easier. This PR also improves fuser modularity, which should make it easier for others to contribute to. Unfortunately, this PR is large and its nature has made breaking it into smaller pieces challenging. Future PRs should be smaller.
The fusion flow is now:
- Fusions are "registered" and "upfront compilation" occurs. The fusion specifications, which includes the graph, go into a thread-safe device-independent cache. Upfront compilation generates some information used later during shape inference.
- Fusions are run, which passes them to an executor that performs shape inference, requests an instantiated fusion from the specification's thread-safe store, and launches them. Launch logic eventually defers to device-specific logic.
- Fusions not previously instantiated are compiled. Compilation is device-specific and arg-specific. Compilation logic eventually defers to device-specific logic.
- If the fusion could not be run because fusion on the requested device is disabled or because shape inference fails, a fallback is invoked.
This flow can be thought of as PyTorch IR -> Device-Independent Fusion Logic -> Device-Specific Fusion Logic. The current upstream logic is, by contrast, PyTorch IR -> Device-Specific Logic -> Device-Independent Logic, which results in needless code duplication and lack of conceptual clarity. That was my mistake when splitting the fuser off from the rest of the jit and our reviews since then have been incredibly helpful in understanding why the approach in this PR is better.
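To make the registration and caching step concrete, here is a generic, hypothetical sketch of a thread-safe specification cache of the kind described above; the type names are invented and do not mirror the fuser's actual classes:
```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the real fusion specification.
struct FusionSpec {
  std::string canonical_graph;  // device-independent description of the fusion
  // ... upfront-compilation results used later by shape inference ...
};

// Device-independent, thread-safe cache: register once, look up on every run.
class FusionSpecCache {
 public:
  int64_t registerSpec(FusionSpec spec) {
    std::lock_guard<std::mutex> lock(mutex_);
    const int64_t key = next_key_++;
    specs_.emplace(key, std::make_shared<FusionSpec>(std::move(spec)));
    return key;
  }
  std::shared_ptr<FusionSpec> lookup(int64_t key) const {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = specs_.find(key);
    return it == specs_.end() ? nullptr : it->second;
  }

 private:
  mutable std::mutex mutex_;
  int64_t next_key_ = 0;
  std::unordered_map<int64_t, std::shared_ptr<FusionSpec>> specs_;
};
```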
This PR does not only move code around. It also fixes a few bugs and makes some logical/code changes.
Bug fixes:
- thread safety is improved: the caches are now guarded against concurrent access
- the nvrtc version is now reviewed to determine the appropriate compute architecture to compile for, fixing a bug that would cause runtime errors if a user's nvrtc didn't support the compute architecture their gpu reported
- an issue with DeviceGuard not setting the device properly and failing silently is worked around (ezyang mentioned he was reviewing the dynamic registration DeviceGuard uses, which may resolve the issue)
Code/Logical changes:
- "const" now appears many more places (note: I cast const away in operator.h because of some obscure build issues -- I think we should be able to fix this and will take a look while this goes through testing)
- The new flow allowed some redundant code to be removed (AnnotatedGraph is gone, for example, and the more straightforward flow eliminated duplication of effort elsewhere)
- Fallback logic is now also invoked if a fusion is requested on a device that cannot handle fusions
- Use of macros to determine which files are compiled is reduced (though they may come back if the Windows build is unhappy)
- There is no more "common" code or folder, the device-independent logic being at the forefront of the fuser replaces and improves upon the goal of sharing code
apaszke who I promised naming rights to
zdevito who correctly pointed out that the device-independent logic should be the bulk of what the fuser is doing
ngimel who contributed to the design of this architecture
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13108
Reviewed By: gchanan, fmassa
Differential Revision: D12850608
Pulled By: soumith
fbshipit-source-id: 24e2df6dfa97591ee36aeca8944519678c301fa3
Summary:
This is likely currently broken due to symbol visibility issues, but we will investigate it using this PR.
CC orionr yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11527
Differential Revision: D10444104
Pulled By: goldsborough
fbshipit-source-id: 4c447beeb9671598ecfc846cb5c507ef143459fe
Summary:
There is still some work to be done:
- Move logging and unify AT_WARN with LOG(ERROR).
- A few header files are still being plumbed through, need cleaning.
- caffe2::EnforceNotMet aliasing is not done yet.
- need to unify the macros. See c10/util/Exception.h
This is mainly a codemod and does not cause functional changes. If you find your job failing and trace it back to this diff, it can usually be fixed by one of the following approaches:
(1) add //caffe2/c10:c10 to your dependency (or transitive dependency).
(2) change objects such as at::Error, at::Optional to the c10 namespace.
(3) change functions to the c10 namespace. In particular, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes. (See the sketch below.)
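A tiny hedged sketch of what fixes (2) and (3) amount to in practice; the helper function here is invented, and only the c10:: qualifications are the point:
```cpp
#include <string>
#include <c10/util/Optional.h>    // c10::optional (previously reached as at::Optional)
#include <c10/util/StringUtil.h>  // c10::str

// Hypothetical helper: objects such as at::Optional (and at::Error) are now
// spelled with c10::, and string building goes through the unified c10::str.
std::string describe(c10::optional<int> maybe_size) {
  if (!maybe_size.has_value()) {
    return c10::str("size not provided");
  }
  return c10::str("size = ", *maybe_size);
}
```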
Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354
Reviewed By: orionr
Differential Revision: D10238910
Pulled By: Yangqing
fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32
Summary:
TODO: integrate into torch.onnx.export -- separate PR
*Problem:* We have a facility to trace PyTorch operations on Python code, but there are several failure modes where the trace is not representative of the actual underlying computation:
* The tracer encountered dynamic control flow
* Some computation escaped the tracer, and appeared as a Constant tensor node in the graph
* Some stateful function was traced, e.g. someone did an optimization in Python by memoizing function outputs
*Objective*: In an ideal world, this whole process would be automated and the user could trust that the system will magically capture the intended semantics from the program. Realistically speaking, we will likely have to settle for a human-in-the-loop error reporting system, allowing the user to identify problems and modify the source code to allow for tracing.
*Stage 1* (this PR): Output-level checking & graph diff. torch.jit.trace gains a kwarg 'check_inputs', which is a list of tuples of input arguments. We will iterate through the list and trace the function again for each set of check inputs. We'll also interpret the original trace with these inputs and compare output values and graphs, printing a diff of the graph if there is a difference.
Examples:
```
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(4, 5),)])
def foo(x):
    y = torch.arange(0, x.shape[0]).float()
    return x + y.unsqueeze(1)
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Graphs differed across invocations!
Graph diff:
graph(%0 : Dynamic) {
- %1 : Dynamic = prim::Constant[value= 0 1 2 [ CPULongType{3} ]]()
? ^
+ %1 : Dynamic = prim::Constant[value= 0 1 2 3 [ CPULongType{4} ]]()
? +++ ^
%2 : int = prim::Constant[value=0]()
%3 : Dynamic = aten::_cast_Float(%1, %2)
%4 : int = prim::Constant[value=1]()
%5 : Dynamic = aten::unsqueeze(%3, %4)
%6 : int = prim::Constant[value=1]()
%7 : Dynamic = aten::add(%0, %5, %6)
return (%7);
}
Node diff:
- %1 : Dynamic = prim::Constant[value= 0 1 2 [ CPULongType{3} ]]()
? ^
+ %1 : Dynamic = prim::Constant[value= 0 1 2 3 [ CPULongType{4} ]]()
? +++ ^
Trace source location:
dank.py(5): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
dank.py(3): <module>
Check source location:
dank.py(5): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(281): check_trace
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(408): wrapper
dank.py(3): <module>
ERROR: Tensor-valued Constant nodes differed in value across invocations. This often indicates that the tracer has encountered untraceable code.
Node:
%1 : Dynamic = prim::Constant[value= 0 1 2 [ CPULongType{3} ]]()
Source Location:
dank.py(5): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
dank.py(3): <module>
Comparison exception:
Not equal to tolerance rtol=1e-07, atol=0
(shapes (3,), (4,) mismatch)
x: array([0, 1, 2])
y: array([0, 1, 2, 3])
```
==
```
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)])
def foo(x):
    y = x.data
    return x + y
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
ERROR: Tensor-valued Constant nodes differed in value across invocations. This often indicates that the tracer has encountered untraceable code.
Node:
%1 : Dynamic = prim::Constant[value=<Tensor>]()
Source Location:
dank.py(6): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
dank.py(3): <module>
Comparison exception:
Not equal to tolerance rtol=1e-07, atol=0
(mismatch 100.0%)
x: array([0.397137, 0.956105, 0.169478, 0.560292, 0.392568, 0.108441,
0.97645 , 0.34412 , 0.951246, 0.793061, 0.557595, 0.770245],
dtype=float32)
y: array([0.243178, 0.315964, 0.972041, 0.0215 , 0.927751, 0.457512,
0.951092, 0.97883 , 0.048688, 0.118066, 0.779345, 0.271272],
dtype=float32)
```
==
```
import torch
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(4, 4),)])
def foo(x):
    for _ in range(x.size(0)):
        x = torch.neg(x)
    return x
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
ERROR: Graphs differed across invocations!
Graph diff:
graph(%0 : Dynamic) {
%1 : Dynamic = aten::neg(%0)
%2 : Dynamic = aten::neg(%1)
%3 : Dynamic = aten::neg(%2)
+ %4 : Dynamic = aten::neg(%3)
- return (%3);
? ^
+ return (%4);
? ^
}
```
==
```
import torch
def foo(x):
    if not hasattr(foo, 'cache'):
        foo.cache = torch.neg(x)
    return x + foo.cache

traced = torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)])(foo)
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
ERROR: Graphs differed across invocations!
Graph diff:
graph(%0 : Dynamic) {
- %1 : Dynamic = aten::neg(%0)
+ %1 : Dynamic = prim::Constant[value=<Tensor>]()
%2 : int = prim::Constant[value=1]()
%3 : Dynamic = aten::add(%0, %1, %2)
return (%3);
}
Node diff:
- %1 : Dynamic = aten::neg(%0)
+ %1 : Dynamic = prim::Constant[value=<Tensor>]()
Trace source location:
test.py(5): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
test.py(8): <module>
Check source location:
test.py(6): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(281): check_trace
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(408): wrapper
test.py(8): <module>
```
The following two examples show instances where program semantics are lost in the Python -> trace transformation, and repeated invocation does not give us useful debug information. Further design is underway for catching these scenarios.
```
import torch
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)])
def foo(x):
    for i in range(3):
        x[i, :] = torch.zeros(4)
    return x
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
Exception:
Not equal to tolerance rtol=1e-07, atol=0
(mismatch 100.0%)
x: array([0.830221, 0.915481, 0.940281, 0.555241], dtype=float32)
y: array([0., 0., 0., 0.], dtype=float32)
```
==
```
import torch
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(5, 6),)])
def foo(x):
    x.view(-1).add_(-x.view(-1))
    return x
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
Exception:
Not equal to tolerance rtol=1e-07, atol=0
(mismatch 100.0%)
x: array([0.734441, 0.445327, 0.640592, 0.30076 , 0.891674, 0.124771],
dtype=float32)
y: array([0., 0., 0., 0., 0., 0.], dtype=float32)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10841
Differential Revision: D9499945
Pulled By: jamesr66a
fbshipit-source-id: 1f842a32d0b0645259cc43b29700b86d99c59a45
Summary:
This is the last step in the custom operator implementation: providing a way to build from C++ and Python. For this I:
1. Created a `FindTorch.cmake` taken largely from ebetica with a CMake function to easily create simple custom op libraries
2. Created a `torch/op.h` header for easy inclusion of necessary headers,
3. Created a test directory `pytorch/test/custom_operator` which includes the basic setup for a custom op.
1. It defines an op in `op.{h,cpp}`
2. Registers it with the JIT using `RegisterOperators`
3. Builds it into a shared library via a `CMakeLists.txt`
4. Binds it into Python using a `setup.py`. This step makes use of our C++ extension setup that we already have. No work, yey!
The pure C++ and the Python builds are separate and not coupled in any way.
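A minimal hedged sketch of steps 3.1 and 3.2; the op name and kernel body are invented, and the exact registration spelling available at the time of this PR may differ from today's:
```cpp
// op.h / op.cpp of a hypothetical custom operator.
#include <torch/op.h>  // the umbrella header from step 2 (newer releases use <torch/script.h>)

// 3.1 Define the op: ordinary ATen/C++ code.
torch::Tensor warp_perspective(torch::Tensor input, torch::Tensor warp) {
  return input.matmul(warp);  // placeholder math, for illustration only
}

// 3.2 Register it with the JIT so TorchScript and tracing can see it.
//     The name-only form lets the registration infer the schema from the
//     function signature.
static auto registry =
    torch::RegisterOperators("my_ops::warp_perspective", &warp_perspective);
```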
zdevito soumith dzhulgakov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10226
Differential Revision: D9296839
Pulled By: goldsborough
fbshipit-source-id: 32f74cafb6e3d86cada8dfca8136d0dfb1f197a0
Summary:
zdevito explained that the attributed versions of `Operator`s are no longer necessary. This PR does two things:
1. Removes all code associated with attributed operators,
2. Adds a second kind of state to `Operator` where it is constructed with an `Operation` directly instead of an `OperationCreator`. This will be useful to test custom operators which don't require a node (you can just retrieve it directly).
Now rebased on top of https://github.com/pytorch/pytorch/pull/9801
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10080
Differential Revision: D9113668
Pulled By: goldsborough
fbshipit-source-id: 1276a191c7cf89da1c38488769f2105ce2664750
Summary:
This should prevent slow startup times, and will not report as many errors during static initialization time, where they are hard to debug
ezyang apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9801
Reviewed By: goldsborough
Differential Revision: D8986603
Pulled By: zdevito
fbshipit-source-id: 440d43ab5e8cffe0b15118cb5fda36391ed06dbc
Summary:
More clang tidy cleanups in `torch/csrc`. This time:
1. `hicpp-use-equals-default` recommends `= default` instead of `{}` for constructors/destructors (see the sketch below). This is better practice because it expresses intent more clearly (https://stackoverflow.com/questions/6502828/what-does-default-mean-after-a-class-function-declaration)
2. `readability-inconsistent-declaration-parameter-name` enforces that parameter names in the declaration match parameter names in the definition. This is just generally useful and can prevent confusion and bugs.
Also updated my script a little bit.
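A before/after sketch of the first check on an invented class:
```cpp
// Hypothetical class used only to illustrate the hicpp-use-equals-default fix.
struct Widget {
  // Before: an empty user-provided destructor.
  //   ~Widget() {}
  // After: explicitly defaulted, which states the intent and lets the
  // compiler keep the destructor trivial where possible.
  ~Widget() = default;

  // Same idea for the default constructor.
  Widget() = default;

  int value = 0;
};
```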
apaszke ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9737
Differential Revision: D9069069
Pulled By: goldsborough
fbshipit-source-id: f7b3f3a4eb4c9fadc30425a153566d3b613a41ae
Summary:
Based on top of #9763 (first 3 commits belong to that PR). The first commits from this PR are "Stop using attributes ..."
I tried to separate the changes into fairly meaningful commits. I can't split them up into smaller PRs, because everything starts working and all tests pass only after the whole sequence, but hopefully this will make reviewing somewhat easier.
Known issues/regressions/future tasks:
- `aten::lerp` and `aten::clamp` are no longer fusable
- `CreateAutodiffSubgraphs` needs a rewrite
- It is much more strict now, and will miss a lot of opportunities, especially when viewing ops are involved. Our previous approach was "ignore the assumption on shape availability in gradient formulas to determine differentiability, and hope that shape prop will be robust enough to actually deliver them before we differentiate", which obviously doesn't scale well to more complex cases. We should either work on reducing the size dependency of grad formulas (feasible e.g. for `view`/`reshape`, unfeasible for `squeeze`/`unsqueeze`), or make `CreateAutodiffSubgraphs` integrate some kind of "I could integrate this node into an AD subgraph, but will I be able to infer the shape of its input" reasoning (kind of like a limited shape prop, that doesn't infer anything, and only tells if it *could* infer something).
- It sometimes creates constant-only (or constants + one node) graphs, which is useless
- Broken `aten::add` in auto-batching, because it gained a non-tensor input. I changed the test for pointwise operations to use `aten::mul` instead, but I needed to disable the LSTM cell test. I'm not sure how scalar constants should be implemented in this case, because I don't fully understand our format. cc: ChunliF
- Graph import does some hacks to recover the types of constants. This code should be removed once we gain the ability to export the IR along with value types.
- There's still a fair amount of dead code that can be removed. I didn't want to make this diff any bigger, and removing it is an easy task.
- Graph fuser could be improved to use signature matching (possibly using `OperatorSet`) instead of basing on node kinds.
- Manual constant propagation for the `ListConstruct` node in `torch/onnx/utils.py` should be replaced with a proper constant propagation pass (or we should ensure that the one we have handles at least this case before we remove this code).
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9807
Reviewed By: ezyang
Differential Revision: D9004285
Pulled By: apaszke
fbshipit-source-id: fe88026a765f6b687354add034c86402362508b7
Summary:
I got some tensor->variable conversion exceptions from `torch/csrc/autograd/variable.h`, which used the `TORCH_ASSERTM` macros instead of `AT_CHECK`, so they didn't have backtraces. This was such a substantial loss for debuggability that I decided to update the whole codebase to use the backtrace-enabled ATen macros instead of `TORCH_ASSERT` and `JIT_ASSERT`, the latter having been an alias of the former.
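A hedged before/after sketch of the change; the function and condition are invented, and the modern spelling of `AT_CHECK` is `TORCH_CHECK`:
```cpp
#include <torch/torch.h>

void check_defined(const torch::Tensor& t) {
  // Before this change: JIT_ASSERT(t.defined());  // aborts with no backtrace
  // After: the ATen check macro (AT_CHECK then, TORCH_CHECK today) throws a
  // c10::Error that carries a backtrace, so failures are debuggable.
  TORCH_CHECK(t.defined(), "expected a defined tensor");
}
```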
ezyang apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9575
Differential Revision: D8924566
Pulled By: goldsborough
fbshipit-source-id: 7a4013b13eec9dbf024cef94cf49fca72f61d441
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9718
This patch switches the interpreter to use IValue's primitive numbers rather than tensors for computing on integers and floats. In addition to preparing the interpreter for first-class support of other types, this cleans up the handling of primitive numbers, making it possible to just use the normal operator overloading dispatch to find the right implementation for numbers. As a result of this change, a lot of other functionality needed to be updated since it was the first time we use non-tensors in a lot of places in the code base.
Notes:
* Fixes code_template.py so that multi-line strings are indented correctly when used on a standalone line
* Cast operators (`int(x)`) are now functional. Some tests have additional conversions to integers because
we no longer allow implicit tensor -> integer conversions, following the same convention as in Python
* prim::ListConstruct/createList has been added to the interpreter for creating lists and this has
replaced aten::stack for integers lists
* gen_jit_dispatch.py has been refactored so that non-tensor types use operators on IValues to extract
the primitives
* IValue gains a .to<T> method that is the equivalent of tensor_as but for IValue instead of at::Tensor
* `constant_as<T>` is switched over to using IValues's `.to<T>` method, to make conversion from constant->IValue->C++ type
more consistent. This functionality combined with `toIValue(Value*)` replaces the `tensor_as` and `as_tensor` family of functions.
* conditional expressions (if, loop) and operators related to them are now computed on integers rather than tensors
* IValue gains constructors for constructing from at::Scalar and converting to it. However, IValue itself will always store
the scalars as a double or int64.
* To align with python 3 syntax, TK_INT, TK_FLOAT, and TK_BOOL have been removed from the parser, and int/float/bool are just treated as special identifiers in the compiler,
along with print. These are represented as special sugared values with a `call` method implemented. For int/float/bool this implements casting behavior.
* Dropped shared_from_this from Type/Module. They were not needed and they made debugging harder because they internally throw/catch exceptions.
* Shape propagation has been updated to support running nodes that include floating point primitive types, this required some refactoring of internal functions.
* TensorToNum and NumToTensor have actual implementations as operators now
* register_prim_ops now contains implementations of math operators for float/int primitive types, and for mixed (prim <+> tensor) versions. This removes the need for special handling in compiler.cpp
* Primitive math is now entirely handled by letting the compiler choose the right overloads. This removes tons of special casing in the compiler.
* incorporates eellison's change to allow casting from return values. Due to the addition of primitive support, the code needed slight modifications, so I just pre-merged it here.
* stack.h gains generic vararg versions of push/pop that know how to convert to/from C++ types:
```
at::Tensor a;
at::Scalar b;
pop(stack, a, b);
at::Tensor c = a + b;
push(stack, c);
```
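Putting the IValue and stack pieces together, here is a hedged sketch of a primitive-op implementation in the style described above; the op itself and the header path are assumptions:
```cpp
#include <cstdint>
#include <ATen/core/stack.h>  // torch::jit::Stack plus the vararg push/pop helpers

// Hypothetical primitive op: pops two ints off the stack and pushes their sum.
// This mirrors the style in which register_prim_ops implements int/float math
// directly on IValues.
void add_ints(torch::jit::Stack& stack) {
  int64_t a = 0, b = 0;
  torch::jit::pop(stack, a, b);    // vararg pop converts IValues into C++ types
  torch::jit::push(stack, a + b);  // push converts the C++ result back into an IValue
}
```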
apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9584
Reviewed By: apaszke
Differential Revision: D8910546
Pulled By: zdevito
fbshipit-source-id: 0f3e60d4d22217f196a8f606549430e43b7e7e30
Summary:
**REVIEW LAST COMMIT ONLY**
As discussed in our yesterday's meeting. Nodes can be now matched to particular overloads using the `matches(...)` function:
```cpp
n->matches("aten::type_as(Tensor self, Tensor other) -> Tensor")
```
This also changes the shape prop and peephole passes to use those functions for matching. This fixes a few bugs, makes them much more robust, and prepares us for removal of attributes.
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9567
Reviewed By: zdevito
Differential Revision: D8938482
Pulled By: apaszke
fbshipit-source-id: eb2382eeeae99692aada2d78d5d0c87c8ef1545e
Summary:
This is a series of two commits that should probably be read separately. They are stacked on top of #9018 since the second commit requires it for correctness.
Commit 1
=======
This commit is the first in a series that will clean up how we handle declaring operators and intrinsics in the JIT to make it more modular and readable. This introduces readable declarations that can be used to register operators and switches gen_jit_dispatch to generate this schema. A follow up PR will remove the dispatch keys like "add-3" and resolve ops directly based on the registered schema, further simplifying the generation process.
* Switches schema over to parsed declarations, in the future this will allow something like:
```
registry.register_intrinsic("foo(Tensor a, Tensor b) -> Tensor", [](Stack& stack) {
...
})
```
This will allow the scalable registration of intrinsics for lists, tuples, and other ops, along with meta-data for these ops (e.g. derivatives and size propagation routines).
The declarations resemble those used by PythonArgParser but have been significantly cleaned up to minimize the number of types that can appear in the declaration. We should strive to get the other parts of PyTorch switched over to this restricted declaration set when possible, but it is too much to do in a single PR. My hope is that eventually we will use a very similar language to describe declarations in C10, and this can serve as a guide for that.
Parsing is done using the script lexer, so it is very robust to whitespace and extensible for future types.
This removes the other way we encoded schema, and makes it easier to see what schema are registered.
Current generated declarations: https://gist.github.com/zdevito/a96a17766fb3a098d69a91ee00abaaf6
* Switches how we handle attempting to use an integer in the place of a fixed-sized int list, such as in conv (e.g. 'int[3] stride=1'). Now that we can statically distinguish between int and Tensor, we handle the expansion as an implicit conversion in the compiler. This allows us to simplify the interpreter since it no longer needs to handle the conversion itself.
* Schema declarations have been changed so that they match the type system in the IR exactly. In particular, attribute_info which was used by liftConstantAttributes has been dropped and constant attributes are lifted purely based on the type of the input. Type conversions in compiler have been simplified due to this change.
* Error highlighting in ErrorReport now only reports at most 20 lines of code, to make reading where an error occurred easier.
Commit 2
=======
This commit unifies aten_dispatch and aten_schema into a single Operator object that both contains schema and implementation information. In the future we can use this object to also contain functionality like shape prop and autodiff needed by all operators. Operators are registered globally, and dispatch logic uses the schema information to figure out which variant to use. Descriptor keys, a frequent source of inscrutable debug errors, have been removed.
* Introduce Operator, to replace TensorOp. Unlike TensorOp, we use Operator for all op implementations, including primitives that may occur in the graphs. The only exceptions are ops that are only known to the interpreter like jumps, and GraphExecutors where we need to record additional debug info.
* Adds a global registry for Operator implementations. aten_dispatch.cpp turns into register_aten_ops.cpp, which registers all the Operators for aten with the operator registry. register_prim_ops.cpp now contains the implementations for primitive operators that used to be in the interpreter. This means that it is now safe to use `getOperation(node)` to lookup the true interpreter function for the node, which will simplify const-propagation passes.
* Remove addInterpreterOpHandler in favor of global operator registry.
* Instead of descriptors, we match Node arguments directly against FunctionSchema describing expected inputs in `matchSchema`. `matchSchema` knows how parse both attributes and positional inputs from a node and match it to the appropriate registered operator. Debug error messages when we try to run an invalid operator are significantly improved: they now automatically display the schema for the op with the same name that are registered.
* Merge aten_schema into register_aten_ops. Each Operator takes a string schema which is parsed to determine when to dispatch to that op.
* Cleans up gen_jit_dispatch.py now that we do not need to write out descriptors. In particular, skip_scalar_overloads can be removed since Richard's code sorts declarations to put Tensor, Tensor declarations first.
* remove matchSchemaAndLiftConstantAttributes and use emitBuiltinCall instead to remove code duplication
* refactor stack manipulation functions into a separate header file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8885
Reviewed By: jamesr66a
Differential Revision: D8751048
Pulled By: zdevito
fbshipit-source-id: 312aabfbf88307c5f6ab947b6caf691468b94557