pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Sebastian Messmer	b527e48588	Use c10::List (#21177 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21177 - Integrate c10::ListPtr into IValue and the c10 dispatcher. - Streamline conversion to/from IValue. Before, we had IValue::to<> and kernel_functor.h had its own ivalue_to_arg_type and return_type_to_ivalue. They are now unified. Also, this means that nested types like Dicts of Lists of Optional of Dict of ... do work as expected now Differential Revision: D15476433 fbshipit-source-id: bde9df80df20091aa8e6ae17ba7e90abd149b954	2019-06-12 13:58:24 -07:00
Michael Suo	cab3e726df	Split out Function into its own file (#21539 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21539 ghimport-source-id: f1e4396a0bec6e30d3179f926ec4da68807942f7 Differential Revision: D15741979 Pulled By: suo fbshipit-source-id: 4cd0ed36bcbf8db0b36a101dda6f58975f806889	2019-06-10 16:37:58 -07:00
Zachary DeVito	ea822d9626	Interpreter support for CallFunction/CallMethod (#21562 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21562 ghimport-source-id: 17e5e183f730f50d97ef48973aafc6249d54978f Reviewed By: suo Differential Revision: D15729500 Pulled By: zdevito fbshipit-source-id: efa8a133b617b1498810392a8da6b513ce00b5eb	2019-06-09 15:28:26 -07:00
Zachary DeVito	18996a8952	unfinished push/pop reduction (#21559 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21559 ghimport-source-id: 81ba4a5638577781e1ea706599966c033c37e814 Reviewed By: suo Differential Revision: D15729501 Pulled By: zdevito fbshipit-source-id: 3423bff61e89617c40078d5fab726b77d21bfa27	2019-06-09 15:28:16 -07:00
Zachary DeVito	13edda417d	Prepare interpreter for function calling (#21558 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21558 ghimport-source-id: a8a19dbefea869ca1401e5afea6c02f31f95b99a Reviewed By: suo Differential Revision: D15729491 Pulled By: zdevito fbshipit-source-id: 9629664608a2379a2ddcafaf741fa8463c4fb917	2019-06-09 15:28:13 -07:00
Zachary DeVito	d71501259b	Revert D15572818: Prepare interpreter for function calling Differential Revision: D15572818 Original commit changeset: 3a9b5f053664 fbshipit-source-id: b932411e8e88c7414c8db332d6049fe4e26bd83e	2019-06-07 22:20:54 -07:00
Zachary DeVito	d4bcab0dba	Revert D15590900: Reduce number of stack manipulation instructions in interpreter. Differential Revision: D15590900 Original commit changeset: 98829979feba fbshipit-source-id: eb7f1d396bb2b98d2852af81c69db81430eba33c	2019-06-07 22:20:50 -07:00
Zachary DeVito	bfb235b8c9	Revert D15618275: Interpreter support for CallFunction/CallMethod Differential Revision: D15618275 Original commit changeset: 038ae27e5416 fbshipit-source-id: 8dbe0f564ba103fe445dacc471085c659171705f	2019-06-07 22:20:40 -07:00
Zachary DeVito	5f6afafdef	Interpreter support for CallFunction/CallMethod (#21325 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21325 ghimport-source-id: eeca1176f5e00c85a69cd016acccf5105e670e02 Reviewed By: jamesr66a Differential Revision: D15618275 Pulled By: zdevito fbshipit-source-id: 038ae27e5416f1ce338009627c839a4d61a00658	2019-06-07 20:56:58 -07:00
Zachary DeVito	dde27958dd	Reduce number of stack manipulation instructions in interpreter. (#21240 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21240 ghimport-source-id: 5e9cbe8b3df3ac721135d2f652a420ae0b14ac55 Reviewed By: jamesr66a Differential Revision: D15590900 Pulled By: zdevito fbshipit-source-id: 98829979feba23685f0ba98ba3cb840157f7259a	2019-06-07 20:56:49 -07:00
Zachary DeVito	c53e4d012d	Prepare interpreter for function calling (#21185 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21185 ghimport-source-id: 6b9cb92d1f1f59bb980dcfa0d29dfe985ee955d1 Reviewed By: jamesr66a Differential Revision: D15572818 Pulled By: zdevito fbshipit-source-id: 3a9b5f053664c09212b97f1391d8d006337b5550	2019-06-07 20:56:46 -07:00
Ilia Cherniavskii	409200df59	Move inter-op settings into ATen/Parallel (#20050 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20050 ghimport-source-id: cc102bab8abf3e56c099245976786317ed63ea14 Differential Revision: D15248576 Pulled By: ilia-cher fbshipit-source-id: 55ddcb7af387ddfc68a42ac7167de07ea648e249	2019-05-17 03:12:02 -07:00
Edward Yang	97e1f07ffc	Replace AT_CHECK with TORCH_CHECK [shard 10/10] Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20436 Reviewed By: jerryzh168 Differential Revision: D15318926 fbshipit-source-id: 71a43070cc50cc174f703ebc595f1d87c6fc1e91	2019-05-15 07:35:37 -07:00
Zachary DeVito	3afd99680c	Remove SourceLocation (respin) (#20333 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20333 ghimport-source-id: e64075bb82067224463e9955d10bd13967d1975d Differential Revision: D15284081 Pulled By: zdevito fbshipit-source-id: ac26ae48392b9daff08f460529c06af8f4e4722a	2019-05-09 16:17:33 -07:00
Wanchao Liang	e870b11ae6	Revert D15275731: Remote SourceLocation Differential Revision: D15275731 Original commit changeset: f4da178c3137 fbshipit-source-id: 830b79735eb2dadc4795b5aae407826bf20ef121	2019-05-09 13:07:11 -07:00
Zachary DeVito	eca91de5d2	Remote SourceLocation (#20300 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20300 ghimport-source-id: 06f606c4db3b70b1d2ed9f6ed4542c3f703c4e17 Differential Revision: D15275731 Pulled By: zdevito fbshipit-source-id: f4da178c31372c2264feb9f99476b9c9aa66c1f2	2019-05-09 11:48:29 -07:00
Mikhail Zolotukhin	8b46938355	Cleanup includes in torch/csrc/jit/* (#19922 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19922 ghimport-source-id: 0434c46bf75621ff79ea27a18a2475e7f13e2487 Differential Revision: D15125015 Pulled By: ZolotukhinM fbshipit-source-id: 5685edfc94067f62e363a85e9badb7f757b1d321	2019-05-06 13:40:26 -07:00
Gregory Chanan	043e363c6c	Cache device on TensorImpl; clean up TensorImpl constructors. (#18833 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18833 ghimport-source-id: 6f2be25fcc5e6be3ffe20582e604bd2c1fbab66b Stack from [ghstack](https://github.com/ezyang/ghstack): * #18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors. * #18832 [STACK] Disallow changing the device of a tensor via set_. * #18831 [STACK] Stop swapping in Storages of the wrong device for Tensors. 1) We cache device on TensorImpl. This means we can access the device without a virtual function and allows us to more easily extend TensorImpls (because they don't need to figure out how to store the Device for themselves). 2) Clean up TensorImpl APIs. We had a constructor that took a TensorTypeId and an allocator and would allocate a Storage based on the recognized types of TensorTypeIds. Instead, we just have two different constructors: one for types with a storage, one without. Reviewed By: dzhulgakov Differential Revision: D14766230 fbshipit-source-id: 745b8db84dcd6cb58f1a8675ad3ff8d033bc50df	2019-04-05 07:21:39 -07:00
James Reed	85f36014e2	Experimental logging/counters API (#18235 ) Summary: This defines a generic counters API that users can utilize to provide monitoring functionality in e.g. a production service. We expose both counters for runtime internals as well as a TorchScript API to create user-defined counters. Synopsis of the API: - `torch/csrc/jit/script/logging.h` specifies the externally-facing API in C++ - `torch/jit/_logging.py` specifies the Python API We use an interface, `LoggerBase`, to define the interactions between users and a logging backend. Implementing a subclass of `LoggerBase` allows the user to handle these events in a custom way, such as logging into a DB or calling into an infra-specific counters API. From the frontend perspective, we can create log events in two ways: 1. We provide an `add_stat_value(name, val)` function. This calls into the Logger backend with a key/value pair. For example, we might call `add_stat_value('foo', 1)` to bump an event counter. 2. We provide a `time_point()` function to record a timestamp in nanoseconds. This can be used in conjunction with `add_stat_value` to record runtime wall clock durations. Examples of frontend usage can be found in `test_jit.py TestLogging`. We provide a trivial `LockingLogger` implementation as an example and for testing purposes. It is likely not ready for production usage. It demonstrates that a backend implementing the API can do things like specify aggregation types and report these aggregate stats via the `get_counters()` API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18235 Differential Revision: D14545060 Pulled By: jamesr66a fbshipit-source-id: 04099543a1898cfdd411511e46e03d5dce9b4881	2019-03-29 17:14:03 -07:00
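A minimal sketch of the logger-backend idea described in the commit above, written in plain Python with no PyTorch dependency. The names `LoggerBase`, `LockingLogger`, `add_stat_value`, `time_point`, and `get_counters` come from the commit message; their signatures and the sum aggregation used here are assumptions for illustration, not the actual `torch/csrc/jit/script/logging.h` or `torch/jit/_logging.py` API.

```python
import threading
import time
from collections import defaultdict


class LoggerBase:
    """Hypothetical backend interface: receives (name, value) stat events."""

    def add_stat_value(self, name, value):
        raise NotImplementedError


class LockingLogger(LoggerBase):
    """Toy backend that sums values per key under a lock (assumed aggregation)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sums = defaultdict(int)

    def add_stat_value(self, name, value):
        with self._lock:
            self._sums[name] += value

    def get_counters(self):
        with self._lock:
            return dict(self._sums)


def time_point():
    """Return a timestamp in nanoseconds for measuring wall-clock durations."""
    return time.perf_counter_ns()


logger = LockingLogger()
logger.add_stat_value("foo", 1)  # bump an event counter
logger.add_stat_value("foo", 1)

start = time_point()
sum(range(1000))  # stand-in for the work being timed
logger.add_stat_value("work_ns", time_point() - start)

counters = logger.get_counters()
```

A different backend (for example one that writes to a database) would only need to subclass `LoggerBase` and override `add_stat_value`.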
James Reed	1d26a3ae7e	Open registration for c10 thread pool (#17788 ) Summary: 1. Move ATen threadpool & open registration mechanism to C10 2. Move the `global_work_queue` to use this open registration mechanism, to allow users to substitute in their own Pull Request resolved: https://github.com/pytorch/pytorch/pull/17788 Reviewed By: zdevito Differential Revision: D14379707 Pulled By: jamesr66a fbshipit-source-id: 949662d0024875abf09907d97db927f160c54d45	2019-03-08 15:38:41 -08:00
Edward Yang	4404762d7d	Rename IntList to IntArrayRef. (#16751 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751 This was made more complicated by the fact that ivalue::IntList is a thing. So I had to fix all of the sites where we were referring to IValue post facto. The following codemods were run, in this order: ``` codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>' codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>' ``` Some manual fixups were done afterwards; they can be reviewed separately at https://github.com/pytorch/pytorch/pull/16752 Reviewed By: dzhulgakov Differential Revision: D13954363 fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64	2019-02-05 14:54:34 -08:00
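The order of the codemods in the commit above matters: the first pass renames every `IntList`, and the later passes restore the `ivalue::IntList` spellings that should not have changed. The sketch below replays that sequence on a single string. It uses plain `str.replace` rather than the `codemod` tool, and the sample input is made up, so treat it as an illustration of the ordering only.

```python
# Rename pairs in the order listed in the commit message.
RENAMES = [
    ("IntList", "IntArrayRef"),
    ("IntArrayRef::create", "IntList::create"),
    ("ivalue::IntArrayRef", "ivalue::IntList"),
    ("Tag::IntArrayRef", "Tag::IntList"),
    ("isIntArrayRef", "isIntList"),
    ("toIntArrayRef", "toIntList"),
    ("Shared<IntArrayRef>", "Shared<IntList>"),
    ("intrusive_ptr<IntArrayRef>", "intrusive_ptr<IntList>"),
]


def apply_renames(source):
    """Apply each literal replacement in sequence (assumed stand-in for codemod)."""
    for old, new in RENAMES:
        source = source.replace(old, new)
    return source


# Hypothetical input line, not taken from the PyTorch sources.
before = (
    "void f(IntList sizes); bool b = v.isIntList(); "
    "auto l = ivalue::IntList::create(x);"
)
after = apply_renames(before)
```

After all passes, the function parameter uses `IntArrayRef`, while the IValue-related names are back to `IntList`.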
Zachary DeVito	c42431bd7a	Revert D13740752: [c10] plug caffe2 into jit Differential Revision: D13740752 Original commit changeset: 2d9383574d42 fbshipit-source-id: e9ff217a438720423340a10af7fa263b33f2ae24	2019-01-25 12:29:19 -08:00
Bram Wasti	6d2aee4a9b	plug caffe2 into jit (#16331 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16331 Temporary measure to enable caffe2 ops in pytorch Reviewed By: smessmer Differential Revision: D13740752 fbshipit-source-id: 2d9383574d42ce84ee471aba32eeb4f5a0cc7a4c	2019-01-24 22:28:21 -08:00
Mikhail Zolotukhin	47bf30661f	Directly include headers from ATen. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16287 Differential Revision: D13792949 Pulled By: ZolotukhinM fbshipit-source-id: d627d8dc469df048063c70d0b5b8d33fede809a3	2019-01-24 11:22:27 -08:00
Sebastian Messmer	0ab8de3125	Remove some dependencies from ivalue.h to ATen (#15855 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15855 This is preparation work for moving IValue to c10. Reviewed By: ezyang Differential Revision: D13605259 fbshipit-source-id: cc545f582ab8607bb02aaf71273cb2710200b295	2019-01-17 16:03:58 -08:00
Guoqiang Jerry Chen	6641b09fac	respect grad guard for torch.jit._fork and torch.jit._wait (#16101 ) Summary: respect grad guard for torch.jit._fork and torch.jit._wait. Verified that the test fails without the fix and passes with it. Ideally I would like to enable and disable grad inside the forked function. It doesn't seem to be supported at the moment. This code handles that as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16101 Differential Revision: D13708374 Pulled By: gqchen fbshipit-source-id: 0533f080c4d0253fb4c61d2a0d3cc22de5721a09	2019-01-17 11:12:57 -08:00
Michael Suo	f636dc9276	clang format world (#15524 ) Summary: The PR clang-formats everything in `torch/csrc/jit/` and adds it to the pre-commit hook. Here is a list of non-mechanical changes: - I went over each file and fixed up whenever I could tell that clang-format was clobbering comment formatting. - Made the macros in register_prim_ops a little more clang-format friendly by omitting trailing commas - Refactored autodiff.cpp to use a helper class with explicit state rather than a bunch of capturing lambdas - Small improvements to the precommit hook clang-format Pull Request resolved: https://github.com/pytorch/pytorch/pull/15524 Differential Revision: D13547989 Pulled By: suo fbshipit-source-id: 3ff1541bb06433ccfe6de6e33f29227a2b5bb493	2018-12-26 06:55:01 -08:00
James Sun	88bf683cbc	Support error handling in forked threads (#14523 ) Summary: Save error info in the future for parent thread to pick up. Throw the error when the thread is the root thread. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14523 Differential Revision: D13251756 Pulled By: highker fbshipit-source-id: b40f9a45665e1a934743f131ec5e8bad5622ce67	2018-12-19 18:54:46 -08:00
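The pattern in the commit above (store the error in the future, let the waiting thread pick it up) can be illustrated by analogy with Python's standard `concurrent.futures`. This is not the `ivalue::Future` implementation; `forked_work` is a hypothetical function used only for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def forked_work(x):
    """Hypothetical task that fails for negative input."""
    if x < 0:
        raise ValueError("negative input")
    return x * 2


with ThreadPoolExecutor(max_workers=1) as pool:
    ok = pool.submit(forked_work, 3)
    bad = pool.submit(forked_work, -1)

    ok_value = ok.result()

    # The worker's exception is stored in the future and re-raised
    # in the parent thread only when the result is requested.
    try:
        bad.result()
        caught = None
    except ValueError as err:
        caught = str(err)
```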
Peter Goldsborough	73ee7fda4c	Remove deprecated variable_tensor_functions (#15003 ) Summary: Removing the deprecated functions in `torch/csrc/variable_tensor_functions.h` (like `torch::CPU`) and corresponding implementations from `torch/csrc/torch.cpp` from master after the release. ezyang gchanan soumith Pull Request resolved: https://github.com/pytorch/pytorch/pull/15003 Differential Revision: D13418086 Pulled By: goldsborough fbshipit-source-id: a0accdf6f7b0efa1ec07ac7b74b86ff2da37543f	2018-12-11 17:16:11 -08:00
Edward Yang	517c7c9861	Canonicalize all includes in PyTorch. (#14849 ) Summary: Anywhere we used #include "foo.h", we now say #include <foo.h> Paths are adjusted to be rooted out of aten/src, torch/lib, or the root level directory. I modified CMakeLists.txt by hand to remove TH and THC from the include paths. I used the following script to do the canonicalization: ``` import subprocess import re import os.path files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n') for fn in files: if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']): continue if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]): continue with open(fn, 'r') as f: c = f.read() def fmt(p): return "#include <{}>".format(p) def repl(m): p = m.group(1) if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]: return fmt(p) if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]): return fmt(p) for root in ["aten/src", "torch/lib", ""]: for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]: new_p = os.path.relpath(os.path.join(bad_root, p), root) if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))): return fmt(new_p) print("ERROR: ", fn, p) return m.group(0) new_c = re.sub(r'#include "([^"]+)"', repl, c) if new_c != c: print(fn) with open(fn, 'w') as f: f.write(new_c) ``` Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849 Reviewed By: dzhulgakov Differential Revision: D13363445 Pulled By: ezyang fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68	2018-12-08 19:38:30 -08:00
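The core of the script in the commit above is a single `re.sub` with a replacement callback. The sketch below keeps only that step: it rewrites quoted includes to angle-bracket form and deliberately omits the path re-rooting against `aten/src`, `torch/lib`, and the repository root that the original script performs. The function name `canonicalize_includes` is made up for this example.

```python
import re


def canonicalize_includes(source):
    """Rewrite #include "foo.h" as #include <foo.h> (no path adjustment)."""

    def repl(match):
        return "#include <{}>".format(match.group(1))

    return re.sub(r'#include "([^"]+)"', repl, source)


src = '#include "ATen/ATen.h"\n#include <vector>\n'
out = canonicalize_includes(src)
```

Includes that already use angle brackets do not match the pattern and pass through unchanged.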
James Sun	186341c5dc	Merge Caffe2 and PyTorch thread pool definitions (#14114 ) Summary: (1) Move Caffe2 thread pool to aten (2) Use the same thread pool definition for PyTorch interpreter (3) Make ivalue::Future thread-safe Pull Request resolved: https://github.com/pytorch/pytorch/pull/14114 Reviewed By: ilia-cher Differential Revision: D13110451 Pulled By: highker fbshipit-source-id: a83acb6a4bafb7f674e3fe3d58f7a74c68064fac	2018-11-28 18:10:20 -08:00
James Sun	d02781a2ef	Make InterpresterStateImpl a intrusive_ptr_target (#13784 ) Summary: InterpresterStateImpl can continue its lifecycle by incrementing the ref count itself. This patch also removes the InterpresterState::clone() interface, which conflicts with intrusive_ptr_target because the latter disallows copying. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13784 Differential Revision: D13015451 Pulled By: highker fbshipit-source-id: a05f1ea6549d52ec693ccffefaa4d520b2474b8c	2018-11-09 23:39:18 -08:00
James Sun	dca3c2c60f	Save and execute futures in a task queue (#13212 ) Summary: Upon calling wait(), save the forked thread and the current thread to a task queue. An idling thread (which is currently single-threaded) should pick a ready task and run until there is nothing left in the task queue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13212 Differential Revision: D12884522 Pulled By: highker fbshipit-source-id: b3942a0ee63c148e05f5f41bdc73007fa3c3368e	2018-11-09 01:46:35 -08:00
Peter Goldsborough	0479517325	Add modernize-* checks to clang-tidy (#13196 ) Summary: Enables almost all `modernize-*` checks in clang-tidy. This warns against things such as: - Use of `const std::string&` instead of new-style `std::string` + move, - Using old-style loops instead of range-for loops, - Use of raw `new` - Use of `push_back` instead of `emplace_back` - Use of `virtual` together with `override` (`override` is sufficient) ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/13196 Differential Revision: D12891837 Pulled By: goldsborough fbshipit-source-id: 4d0f782a09eb391ee718d3d66f74c095ee121c09	2018-11-02 20:30:40 -07:00
mruberry	6fe089c6ea	Hierarchical device independent -> device specific architecture (#13108 ) Summary: This PR principally redesigns the fuser's logical flow to be hierarchical, with device-independent logic directing (relatively little) device-specific logic. This design is based on reviews of XLA, TVM, internal design review at NVIDIA and discussions with fuser owners at Facebook. To further vet the design I have begun developing the next significant PR (extended fusion logic) on top of this architecture and it has made the work significantly easier. This PR also improves fuser modularity, which should make it easier for others to contribute to. Unfortunately, this PR is large and its nature has made breaking it into smaller pieces challenging. Future PRs should be smaller. The fusion flow is now: - Fusions are "registered" and "upfront compilation" occurs. The fusion specifications, which includes the graph, go into a thread-safe device-independent cache. Upfront compilation generates some information used later during shape inference. - Fusions are run, which passes them to an executor that performs shape inference, requests an instantiated fusion from the specification's thread-safe store, and launches them. Launch logic eventually defers to device-specific logic. - Fusions not previously instantiated are compiled. Compilation is device-specific and arg-specific. Compilation logic eventually defers to device-specific logic. - If the fusion could not be run because fusion on the requested device is disabled or shape inference fails a fallback is invoked. This flow can be thought of as PyTorch IR -> Device-Independent Fusion Logic -> Device-Specific Fusion Logic. The current upstream logic is, by contrast, PyTorch IR -> Device-Specific Logic -> Device-Independent Logic, which results in needless code duplication and lack of conceptual clarity. 
That was my mistake when splitting the fuser off from the rest of the jit and our reviews since then have been incredibly helpful in understanding why the approach in this PR is better. This PR does not only move code around. It also fixes a couple of bugs and makes some logical/code changes. Bug fixes: - thread-safety is improved with caches preventing concurrent access - the nvrtc version is now reviewed to determine the appropriate compute architecture to compile for, fixing a bug that would cause runtime errors if a user's nvrtc didn't support the compute architecture their gpu reported - an issue with DeviceGuard not setting the device properly and failing silently is worked around (ezyang mentioned he was reviewing the dynamic registration DeviceGuard uses, which may resolve the issue) Code/Logical changes: - "const" now appears in many more places (note: I cast const away in operator.h because of some obscure build issues -- I think we should be able to fix this and will take a look while this goes through testing) - The new flow allowed some redundant code to be removed (AnnotatedGraph is gone, for example, and the more straightforward flow eliminated duplication of effort elsewhere) - Fallback logic is now also invoked if a fusion is requested on a device that cannot handle fusions - Use of macros to determine which files are compiled is reduced (though they may come back if the Windows build is unhappy) - There is no more "common" code or folder, the device-independent logic being at the forefront of the fuser replaces and improves upon the goal of sharing code. Thanks to apaszke, who I promised naming rights to; zdevito, who correctly pointed out that the device-independent logic should be the bulk of what the fuser is doing; and ngimel, who contributed to the design of this architecture. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13108 Reviewed By: gchanan, fmassa Differential Revision: D12850608 Pulled By: soumith fbshipit-source-id: 24e2df6dfa97591ee36aeca8944519678c301fa3	2018-10-31 18:13:00 -07:00
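The fusion flow described in the commit above (register a spec in a device-independent cache, compile per device on first use, fall back when the device cannot fuse) can be sketched roughly as follows. Every name here (`FusionRegistry`, `cpu_compile`, `interpret`) is hypothetical, and the sketch simplifies the commit's description by keying compiled kernels on the device only, not on argument properties, and by skipping shape inference.

```python
import threading


class FusionRegistry:
    """Hypothetical device-independent store of fusion specs and kernels."""

    def __init__(self, compilers):
        # Maps a device name to a function: graph -> callable kernel.
        # A missing entry stands for "fusion disabled on that device".
        self._compilers = compilers
        self._lock = threading.Lock()
        self._specs = {}
        self._kernels = {}

    def register(self, graph):
        """Store a fusion spec and return its key."""
        with self._lock:
            key = len(self._specs)
            self._specs[key] = graph
        return key

    def run(self, key, device, args, fallback):
        """Run a registered fusion, compiling lazily or falling back."""
        compiler = self._compilers.get(device)
        if compiler is None:
            return fallback(self._specs[key], args)
        with self._lock:
            kernel = self._kernels.get((key, device))
            if kernel is None:
                kernel = compiler(self._specs[key])
                self._kernels[(key, device)] = kernel
        return kernel(args)


def interpret(graph, args):
    """Fallback path: apply the graph element by element."""
    return [graph(a) for a in args]


def cpu_compile(graph):
    """Stand-in for device-specific compilation."""
    return lambda args: [graph(a) for a in args]


registry = FusionRegistry({"cpu": cpu_compile})
key = registry.register(lambda x: x * 2 + 1)
fused = registry.run(key, "cpu", [1, 2, 3], interpret)
fell_back = registry.run(key, "cuda", [1, 2, 3], interpret)
```

Both calls produce the same values; the second one takes the fallback path because no compiler is registered for `"cuda"` in this toy setup.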
Elias Ellison	59f8e8ada7	First step at adding exceptions (#12789 ) Summary: This is a first step towards adding exceptions. We need minimal support in order to begin converting the torch library to weak script mode (which is the main goal here). Some limitations (that are documented in the tests & compiler): 1. Cannot assign exceptions to variables 2. Any name after raise is being treated as a valid Exception 3. No control flow analysis yet. Below a will be undefined: if True: a = 1 else: raise Exception("Hi") return a Pull Request resolved: https://github.com/pytorch/pytorch/pull/12789 Differential Revision: D12848936 Pulled By: eellison fbshipit-source-id: 1f60ceef2381040486123ec797e97d65b074862d	2018-10-30 20:25:50 -07:00
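For readability, here is the flattened snippet from limitation 3 of the commit above laid out as ordinary Python. The function wrapper `f` is added for the example. In plain Python this returns 1; according to the commit message, the script compiler at that point performed no control flow analysis and so treated `a` as undefined at the `return`.

```python
def f():
    if True:
        a = 1
    else:
        raise Exception("Hi")
    return a


result = f()
```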
Yangqing Jia	713e706618	Move exception to C10 (#12354 ) Summary: There is still some work to be done: - Move logging and unify AT_WARN with LOG(ERROR). - A few header files are still being plumbed through and need cleaning. - caffe2::EnforceNotMet aliasing is not done yet. - need to unify the macros. See c10/util/Exception.h This is mainly a codemod and does not cause functional changes. If you find your job failing and trace it back to this diff, it can usually be fixed by one of the following approaches: (1) add //caffe2/c10:c10 to your dependency (or transitive dependency). (2) change objects such as at::Error, at::Optional to the c10 namespace. (3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes. Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354 Reviewed By: orionr Differential Revision: D10238910 Pulled By: Yangqing fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32	2018-10-15 13:33:18 -07:00
Zachary DeVito	bd09ab6687	Remove stages from IR, they are not longer used Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12352 Differential Revision: D10219743 Pulled By: zdevito fbshipit-source-id: 4d9441dc3748616f9b1f0734c65ec1a7abb0d663	2018-10-05 13:58:15 -07:00
David Riazati	d1ac1eba3b	Add `bool` type to IR (#11834 ) Summary: This PR adds a bool type to `IValue` and puts it into place. * changes conds for `prim::If` and `prim::Loop` to use `bool` type * changes operators that take `bool`s to match their native ops * fixes ambiguous `aten` ops `aten::std` and `aten::var` * fixes tests in `test_jit.py TestJitGenerated` ``` 'test_std_dim', 'test_std_dim_1d', 'test_std_dim_1d_neg0', 'test_std_dim_neg0', 'test_var_dim', 'test_var_dim_1d', 'test_var_dim_1d_neg0', 'test_var_dim_neg0' ``` * adds `prim::BoolToTensor` and `prim::TensorToBool` apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11834 Differential Revision: D9928570 Pulled By: driazati fbshipit-source-id: 373c53df2f1a8ffa9e33d9a517002fbeef25f3eb	2018-10-03 12:40:03 -07:00
Michael Suo	7f35e92af2	mutable lists (#10700 ) Summary: This PR implements the design that we discussed. Changes: - Added a World token IValue and type. The IValue is basically a dummy struct for now, in the future we may extend it (say, add thread-local state). - Effectful ops explicitly declare they are mutable by having World tokens as inputs and outputs in their schema. - Purely functional ops that use mutable values will get "fenced" and the world token will be threaded through the fences - AnnotateEffects pass which wires up all the world tokens together. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10700 Reviewed By: eellison Differential Revision: D9547881 Pulled By: michaelsuo fbshipit-source-id: ebbd786c31f15bf45e2ddb0c188438ff2f5f3c88	2018-09-27 19:25:13 -07:00
Edward Yang	11bd2f2509	Retainable is no more (#11900 ) Summary: Stack:     ⚫  #11900 Retainable is no more  [💛](https://our.intern.facebook.com/intern/diff/D9977505/)     ⚪  #11902 Refactor fastGet/fastSet for clarity, removing a null pointer check.  [💛](https://our.intern.facebook.com/intern/diff/D9977654/) Kill it with fire Pull Request resolved: https://github.com/pytorch/pytorch/pull/11900 Differential Revision: D9979779 Pulled By: ezyang fbshipit-source-id: 0a437e7a0baadb6440e7dc39a01b4a406171faa7	2018-09-21 06:58:18 -07:00
Edward Yang	f6a6d7fae1	Switch at::TensorImpl to store TypeMeta rather than ScalarType Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11702 Reviewed By: cpuhrsch Differential Revision: D9831384 fbshipit-source-id: 1b1233a70ed70b47a3dab4a5797b6cfcb7a2c265	2018-09-17 09:09:35 -07:00
Mike Ruberry	96d3f968eb	Splits CPU and CUDA fusion compilers (#10981 ) Summary: This PR splits the CPU and CUDA fusion compilers, putting them into a new jit/fusers/ directory with jit/fusers/common for common components. In particular: - A fusion interface is created that allows "fusion handles" to be requested - The CPU and CUDA fusers implement this interface, with dispatch determined by device - The fusion compilers, fusion function specializations and resource strings are split - CPU-specific classes like TempFile and DynamicLibrary are in the CPU fuser - Common classes like TensorDesc and the base fusion function class are in jit/fusers/common - There is still some specialization in jit/fusers/common, but these specializations are small(-ish) - Updates the build system to remove the dummy interface on Windows and minimize the use of macros This structure should allow in-flight PRs to easily rebase while providing a clear interface to the fusers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10981 Reviewed By: soumith Differential Revision: D9701999 Pulled By: apaszke fbshipit-source-id: 3b6bec7b97e0444b2a93caa38d9b897f2e68c1b3	2018-09-14 14:05:34 -07:00
Gregory Chanan	cee743f639	Move backward/set_data to Type-based dispatch. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11440 Differential Revision: D9736565 Pulled By: gchanan fbshipit-source-id: 1e66f54f1c87084f37c0b014030f0d6d2f8dfaee	2018-09-10 08:40:29 -07:00
Adam Paszke	00df09b65d	Change specialization rules in GraphExecutors (#10977 ) Summary: Review last commit only. Stacked on top of #10949. This commit fixes a number of issues connected to caching differentiability status of graphs inside graph executors, and changes the rules for optimization of differentiable subgraphs. Previously every one of those was instantiated as a separate graph executor, but now they are simply heavier-optimized graph regions, and graph executors are only instantiated for their backward. zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10977 Differential Revision: D9600626 Pulled By: apaszke fbshipit-source-id: dad09a0f586e396afbd5406319c1cd54fbb8a3d3	2018-08-30 22:11:01 -07:00
Edward Yang	f7b02b3a68	Change Tensor/TensorImpl to use c10::intrusive_ptr (#10824 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10824 API additions: - Tensor(c10::intrusive_ptr<TensorImpl,UndefinedTensor>&&) - Tensor(const c10::intrusive_ptr<TensorImpl,UndefinedTensor>&) - Tensor::operator=(Tensor&&) && (for completeness sake) - TensorBase::unsafeGetTensorImpl() - TensorBase::unsafeReleaseTensorImpl() - TensorBase::getIntrusivePtr() - TensorImpl::type_id() - Tensor::set_data() - Tensor::is_same(Tensor) - Tensor::use_count() - Tensor::type_id() - Tensor::scalar_type() - WeakTensor::is_same(WeakTensor) - intrusive_ptr::weak_use_count() - weak_intrusive_ptr::weak_use_count() - c10::raw::intrusive_ptr::{incref,decref,make_weak} - c10::raw::weak_intrusive_ptr::{incref,decref,lock} API changes: - Tensor::pImpl is no longer public (and now named tensor_impl_) - Most methods accessed this way are now accessible on Tensor maybe_zero_dim() and set_wrapped_number() being prominent exceptions (they are now accessed through unsafeGetTensorImpl()) - Type is no longer friend of Tensor - TensorBase::reset(TensorImpl) is deleted - TensorBase::reset(TensorImpl, bool should_retain) is deleted - TensorBase::swap(TensorBaseImpl&) is deleted; use std::swap instead - TensorBase::get() is deleted; use unsafeGetTensorImpl() instead - TensorBase::detach() is deleted; use unsafeReleaseTensorImpl() instead - TensorBase::retain() is deleted; use _raw_incref() instead - TensorBase::release() is deleted; use _raw_decref() instead - WeakTensor lost most of its methods (it no longer inherits from TensorBase) - TensorImpl::storage() is now a const method - Tensor(TensorBase) constructor removed, instead we go through getIntrusivePtr(). I'm not sure about this change; I happened to have accidentally removed the TensorBase constructor and decided to fix call sites, but I could go the other way. 
- detail::set_data() is deleted; use Tensor::set_data() instead - c10::raw_intrusive_ptr_target removed; use the functions in c10::raw instead. (The reason for this change, is that it is invalid to cast an intrusive_ptr_target* to a raw_intrusive_ptr_target* to take advantage of the methods. But there is no reason the incref/decref methods shouldn't also work on intrusive_ptr_target; it is primarily an API consideration. We can be more standards compliant by keeping them as functions, which are universally applicable.) - intrusive_ptr::reclaim() and weak_intrusive_ptr::reclaim() now work on pointers of the NullType. (This counts as a bug fix, because the documentation specified that pointers produced by release() are valid to reclaim(), and a release() on a null intrusive_ptr produces the NullType::singleton()) Bug fixes: - Dispatch code for mutable references incorrectly returned a reference to a value argument (which would immediately go out of scope). They now correctly return a tensor by value. - intrusive_ptr copy/move assignment did not work correctly when an object was assigned to itself. We now check for this case and no-op if so. (This bug manifested itself as a Tensor mysteriously becoming an UndefinedTensor after lines of code like 'x = x.mul_(y)') Other changes: - The checked cast functions in Utils.h have now been renamed and detemplatized into checked unwrap functions. - Added type_id() and scalar_type() methods to Tensor - pImpl is no longer public - Documented what the && overloads are doing - All occurrences of 'new TensorImpl' (and similar spellings, like 'new THTensor') have been expunged. This is NO LONGER a valid way to create a new tensor, and if you do this, upon your first incref, you will catch an ASSERT failure saying that only tensors created by intrusive_ptr::release() are valid to reclaim(). Use c10::make_intrusive instead in this situation. 
- IValue is adjusted to use intrusive_ptr instead of Retainable, and all other sub-classes of Retainable were modified to use intrusive_ptr. When doing this, I had to make the constructors of sub-classes like ConstantList public, so that c10::make_intrusive could invoke them. Fortunately, if you incorrectly stack allocate a ConstantList, and then try to get an intrusive_ptr to it, it will fail, as stack allocated ConstantLists have refcount 0. - IValue very narrowly sidesteps the problem of handling NullType, as it considers intrusive_ptr<TensorImpl> identical to intrusive_ptr<TensorImpl, UndefinedTensor> which is not always true. This was always the case, but there's now a comment explaining what's going on. Some MSVC bugs were uncovered during the preparation of this patch. They are documented as comments in the code. Reviewed By: gchanan Differential Revision: D9481140 fbshipit-source-id: 14a8ea0c231ed88b5715fb86d92730926f9f92fc	2018-08-27 16:11:01 -07:00
Adam Paszke	c8b246abf3	Prevent JIT from overspecializing to every single size configuration (#10844 ) Summary: Please review the expects carefully to make sure there are no regressions. I tried to go over them one by one when they changed, but it's sometimes easy to miss finer details. Summary of changes: - Renamed `TensorType` to `CompleteTensorType`. Added a new `TensorType` which records only the scalar type, number of dimensions, and device of a value. The argument behind the rename is to encourage people to use `CompleteTensorType` less, as most passes will only have limited information available. To make the transition easier, `complete_type->cast<TensorType>()` works, and makes our passes work with both kinds of specialization if they don't need the extra detail. - Renamed `ArgumentSpec` to `CompleteArgumentSpec`. Added a new `ArgumentSpec`, which matches arguments only at the level of the new `TensorType`. - Shape analysis can process graphs with both `CompleteTensorType` and `TensorType`. - Fuser was a part that heavily relied on full shape information being available. Now, we simply try to fuse the largest possible graphs, and have to do run-time checks to make sure they match the code we generate. If they don't, we fall back to regular interpretation. The shape checks are implemented using an optimized method exploiting algebraic properties of shapes with broadcasting, and the relations of broadcasting with pointwise ops. A full written proof of correctness of the shape checking algorithm is included in a comment in `graph_fuser.cpp`. zdevito ezyang mruberry ngimel csarofeen Pull Request resolved: https://github.com/pytorch/pytorch/pull/10844 Differential Revision: D9498705 Pulled By: apaszke fbshipit-source-id: 0c53c2fcebd871cc2a29c260f8d012276479cc61	2018-08-26 09:54:48 -07:00
Edward Yang	19031c68dc	Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage (#10488 ) Summary: ``` Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage This patch makes two major changes: - It replaces the use of Retainable in Storage with a new implementation based on intrusive_ptr. This will be necessary because Caffe2 will be using this class to implement intrusive_ptrs, and we need to line these up for the merge. One good thing about the new implementation is that the default copy/move constructors/assignment operators and destructor work automatically, instead of needing to be hardcoded into Storage/Tensor. - It replaces all places where we returned std::unique_ptr<Storage> with Storage, collapsing a double indirection that is no longer necessary now that we have correctly working copy/move constructors. I didn't initially want to do step (2), but it was very important to eliminate all bare uses of new Storage and new StorageImpl, and making this API change was the most straightforward way to do this. HOW TO FIX YOUR CODE IN THE NEW API - You no longer need to dereference the result of tensor.storage() to pass it to set. So, instead of: x.set_(*y.storage()); just write: x.set_(y.storage()); - If you were accessing methods on StorageImpl via the pImpl() method, you must use the dot operator to run pImpl(). Even better; just drop pImpl, we now have method forwarding. So, instead of: storage->pImpl()->data(); just do: storage->data(); // storage.pImpl()->data() works too but is not as recommended - storage->getDevice() is no more; instead use storage->device().index() MISC CODE UPDATES - retain, release, weak_retain, weak_release and weak_lock are now reimplemented using the "blessed API", and renamed to make it clearer that their use is discouraged.
- nvcc OS X and general OS X portability improvements to intrusive_ptr - A new comment in intrusive_ptr describing how stack allocated intrusive_ptr_targets work differently than heap allocated ones from c10::make_intrusive CAVEAT EMPTOR - THStorage_weakRetain used to work on strong pointers, but it NO LONGER works with intrusive_ptr. You must reclaim the strong pointer into a real strong pointer, construct a weak pointer from it, and then release the strong and weak pointers. See StorageSharing.cpp for an example. ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/10488 Reviewed By: gchanan Differential Revision: D9306134 Pulled By: ezyang fbshipit-source-id: 02d58ef62dab8e4da6131e1a24834a65c21048e2	2018-08-21 21:39:55 -07:00
Gregory Chanan	00f2731112	Merge THTensor into TensorImpl Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10479 Differential Revision: D9315800 Pulled By: gchanan fbshipit-source-id: b13ef0de3342600b02b54e0700eb02021a9d1a9e	2018-08-16 08:10:06 -07:00
Edward Yang	64235d5c01	Rewrite TensorImpl to use TensorTypeId. (#10278 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10278 Translation to Backend happens immediately before we go into the Type universe; otherwise we use TensorTypeId. I allocated TensorTypeId corresponding exactly to existing ATen Backend. Only CPUTensorId and CUDATensorId are relevant in the Caffe2 universe. Reviewed By: gchanan Differential Revision: D9184060 fbshipit-source-id: 9d3989c26f70b90f1bbf98b2a96c57e2b0a46597	2018-08-13 11:20:04 -07:00
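
Commit f7b02b3a68 above mentions that intrusive_ptr copy/move assignment misbehaved when an object was assigned to itself, and that the fix is to check for that case and do nothing. The following is a minimal sketch of that idea, not the c10 implementation: `Counted`, `SketchPtr`, and `survives_self_assignment` are hypothetical names invented for illustration. In this sketch, dropping the `this != &other` guard would release the only reference before re-reading it, leaving the pointer empty, which resembles the "mysteriously becoming an UndefinedTensor" symptom the commit describes.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical refcounted target; a stand-in, not c10::intrusive_ptr_target.
struct Counted {
  explicit Counted(int p) : payload(p) {}
  std::size_t refcount = 0;
  int payload;
};

// Hypothetical minimal intrusive pointer, for illustration only.
class SketchPtr {
 public:
  SketchPtr() = default;
  explicit SketchPtr(Counted* target) : target_(target) { retain(); }
  SketchPtr(const SketchPtr& other) : target_(other.target_) { retain(); }
  SketchPtr(SketchPtr&& other) noexcept : target_(other.target_) {
    other.target_ = nullptr;
  }
  ~SketchPtr() { release(); }

  SketchPtr& operator=(const SketchPtr& other) {
    if (this != &other) {  // self-assignment is a no-op
      release();
      target_ = other.target_;
      retain();
    }
    return *this;
  }

  SketchPtr& operator=(SketchPtr&& other) noexcept {
    if (this != &other) {  // self-move is a no-op
      release();
      target_ = other.target_;
      other.target_ = nullptr;
    }
    return *this;
  }

  bool defined() const { return target_ != nullptr; }
  std::size_t use_count() const { return target_ ? target_->refcount : 0; }
  Counted* get() const { return target_; }

 private:
  void retain() {
    if (target_) ++target_->refcount;
  }
  // Drops this pointer's reference and frees the target at refcount zero.
  void release() {
    if (target_ && --target_->refcount == 0) delete target_;
    target_ = nullptr;
  }
  Counted* target_ = nullptr;
};

// Returns true when a pointer is unchanged by copy and move self-assignment.
bool survives_self_assignment() {
  SketchPtr p(new Counted(7));
  SketchPtr& alias = p;  // alias, so the self-assignment is explicit
  p = alias;
  p = std::move(alias);
  return p.defined() && p.use_count() == 1 && p.get()->payload == 7;
}
```

The guard compares object addresses rather than target pointers, so two distinct pointers to the same target still go through the ordinary release-then-retain path.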


101 Commits
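
Commit c8b246abf3 in the log above says the fuser now performs run-time shape checks based on broadcasting. As a rough illustration of what such a check has to establish, the sketch below computes the broadcast of input shapes using the standard trailing-dimension rule and compares it with an expected output shape. This is the naive check, not the optimized algorithm documented in `graph_fuser.cpp`; `Shape`, `broadcast_shapes`, and `shapes_broadcast_to` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Shape = std::vector<std::int64_t>;

// Writes the broadcast of a and b into out.
// Returns false (leaving out untouched) when the shapes are incompatible.
bool broadcast_shapes(const Shape& a, const Shape& b, Shape& out) {
  const std::size_t rank = std::max(a.size(), b.size());
  Shape result(rank, 1);
  for (std::size_t i = 0; i < rank; ++i) {
    // Align from the trailing dimension; missing leading dims act as size 1.
    const std::int64_t da = i < a.size() ? a[a.size() - 1 - i] : 1;
    const std::int64_t db = i < b.size() ? b[b.size() - 1 - i] : 1;
    if (da != db && da != 1 && db != 1) return false;
    result[rank - 1 - i] = (da == 1) ? db : da;
  }
  out = result;
  return true;
}

// Returns true when all inputs broadcast together to exactly `expected`.
bool shapes_broadcast_to(const std::vector<Shape>& inputs,
                         const Shape& expected) {
  Shape acc;  // rank-0 shape is the identity for broadcasting
  for (const Shape& s : inputs) {
    Shape next;
    if (!broadcast_shapes(acc, s, next)) return false;
    acc = next;
  }
  return acc == expected;
}
```

A caller in the spirit of the commit would run the fused kernel only when `shapes_broadcast_to` returns true and otherwise fall back to regular interpretation.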