Commit Graph

101 Commits

Author SHA1 Message Date
Sebastian Messmer
b527e48588 Use c10::List (#21177)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21177

- Integrate c10::ListPtr into IValue and the c10 dispatcher.
- Streamline conversion to/from IValue. Before, we had IValue::to<> and kernel_functor.h had its own ivalue_to_arg_type and return_type_to_ivalue. They are now unified. Also, this means that nested types like Dicts of Lists of Optional of Dict of ... do work as expected now

Differential Revision: D15476433

fbshipit-source-id: bde9df80df20091aa8e6ae17ba7e90abd149b954
2019-06-12 13:58:24 -07:00
Michael Suo
cab3e726df Split out Function into its own file (#21539)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21539
ghimport-source-id: f1e4396a0bec6e30d3179f926ec4da68807942f7

Differential Revision: D15741979

Pulled By: suo

fbshipit-source-id: 4cd0ed36bcbf8db0b36a101dda6f58975f806889
2019-06-10 16:37:58 -07:00
Zachary DeVito
ea822d9626 Interpreter support for CallFunction/CallMethod (#21562)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21562
ghimport-source-id: 17e5e183f730f50d97ef48973aafc6249d54978f

Reviewed By: suo

Differential Revision: D15729500

Pulled By: zdevito

fbshipit-source-id: efa8a133b617b1498810392a8da6b513ce00b5eb
2019-06-09 15:28:26 -07:00
Zachary DeVito
18996a8952 unfinished push/pop reduction (#21559)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21559
ghimport-source-id: 81ba4a5638577781e1ea706599966c033c37e814

Reviewed By: suo

Differential Revision: D15729501

Pulled By: zdevito

fbshipit-source-id: 3423bff61e89617c40078d5fab726b77d21bfa27
2019-06-09 15:28:16 -07:00
Zachary DeVito
13edda417d Prepare interpreter for function calling (#21558)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21558
ghimport-source-id: a8a19dbefea869ca1401e5afea6c02f31f95b99a

Reviewed By: suo

Differential Revision: D15729491

Pulled By: zdevito

fbshipit-source-id: 9629664608a2379a2ddcafaf741fa8463c4fb917
2019-06-09 15:28:13 -07:00
Zachary DeVito
d71501259b Revert D15572818: Prepare interpreter for function calling
Differential Revision:
D15572818

Original commit changeset: 3a9b5f053664

fbshipit-source-id: b932411e8e88c7414c8db332d6049fe4e26bd83e
2019-06-07 22:20:54 -07:00
Zachary DeVito
d4bcab0dba Revert D15590900: Reduce number of stack manipulation instructions in interpreter.
Differential Revision:
D15590900

Original commit changeset: 98829979feba

fbshipit-source-id: eb7f1d396bb2b98d2852af81c69db81430eba33c
2019-06-07 22:20:50 -07:00
Zachary DeVito
bfb235b8c9 Revert D15618275: Interpreter support for CallFunction/CallMethod
Differential Revision:
D15618275

Original commit changeset: 038ae27e5416

fbshipit-source-id: 8dbe0f564ba103fe445dacc471085c659171705f
2019-06-07 22:20:40 -07:00
Zachary DeVito
5f6afafdef Interpreter support for CallFunction/CallMethod (#21325)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21325
ghimport-source-id: eeca1176f5e00c85a69cd016acccf5105e670e02

Reviewed By: jamesr66a

Differential Revision: D15618275

Pulled By: zdevito

fbshipit-source-id: 038ae27e5416f1ce338009627c839a4d61a00658
2019-06-07 20:56:58 -07:00
Zachary DeVito
dde27958dd Reduce number of stack manipulation instructions in interpreter. (#21240)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21240
ghimport-source-id: 5e9cbe8b3df3ac721135d2f652a420ae0b14ac55

Reviewed By: jamesr66a

Differential Revision: D15590900

Pulled By: zdevito

fbshipit-source-id: 98829979feba23685f0ba98ba3cb840157f7259a
2019-06-07 20:56:49 -07:00
Zachary DeVito
c53e4d012d Prepare interpreter for function calling (#21185)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21185
ghimport-source-id: 6b9cb92d1f1f59bb980dcfa0d29dfe985ee955d1

Reviewed By: jamesr66a

Differential Revision: D15572818

Pulled By: zdevito

fbshipit-source-id: 3a9b5f053664c09212b97f1391d8d006337b5550
2019-06-07 20:56:46 -07:00
Ilia Cherniavskii
409200df59 Move inter-op settings into ATen/Parallel (#20050)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20050
ghimport-source-id: cc102bab8abf3e56c099245976786317ed63ea14

Differential Revision: D15248576

Pulled By: ilia-cher

fbshipit-source-id: 55ddcb7af387ddfc68a42ac7167de07ea648e249
2019-05-17 03:12:02 -07:00
Edward Yang
97e1f07ffc Replace AT_CHECK with TORCH_CHECK [shard 10/10]
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20436

Reviewed By: jerryzh168

Differential Revision: D15318926

fbshipit-source-id: 71a43070cc50cc174f703ebc595f1d87c6fc1e91
2019-05-15 07:35:37 -07:00
Zachary DeVito
3afd99680c Remove SourceLocation (respin) (#20333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20333
ghimport-source-id: e64075bb82067224463e9955d10bd13967d1975d

Differential Revision: D15284081

Pulled By: zdevito

fbshipit-source-id: ac26ae48392b9daff08f460529c06af8f4e4722a
2019-05-09 16:17:33 -07:00
Wanchao Liang
e870b11ae6 Revert D15275731: Remote SourceLocation
Differential Revision:
D15275731

Original commit changeset: f4da178c3137

fbshipit-source-id: 830b79735eb2dadc4795b5aae407826bf20ef121
2019-05-09 13:07:11 -07:00
Zachary DeVito
eca91de5d2 Remote SourceLocation (#20300)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20300
ghimport-source-id: 06f606c4db3b70b1d2ed9f6ed4542c3f703c4e17

Differential Revision: D15275731

Pulled By: zdevito

fbshipit-source-id: f4da178c31372c2264feb9f99476b9c9aa66c1f2
2019-05-09 11:48:29 -07:00
Mikhail Zolotukhin
8b46938355 Cleanup includes in torch/csrc/jit/* (#19922)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19922
ghimport-source-id: 0434c46bf75621ff79ea27a18a2475e7f13e2487

Differential Revision: D15125015

Pulled By: ZolotukhinM

fbshipit-source-id: 5685edfc94067f62e363a85e9badb7f757b1d321
2019-05-06 13:40:26 -07:00
Gregory Chanan
043e363c6c Cache device on TensorImpl; clean up TensorImpl constructors. (#18833)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18833
ghimport-source-id: 6f2be25fcc5e6be3ffe20582e604bd2c1fbab66b

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.**
* #18832 [STACK] Disallow changing the device of a tensor via set_.
* #18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.

1) We cache device on TensorImpl.  This means we can access the device without a virtual function and allows us to more easily extend TensorImpls (because they don't need to figure out how to store the Device for themselves).

2) Clean up TensorImpl APIs.  We had a constructor that took a TensorTypeId and an allocator and would allocate a Storage based on the recognized types of TensorTypeIds.  Instead, we just have two different constructors: one for types with a storage, one without.

Reviewed By: dzhulgakov

Differential Revision: D14766230

fbshipit-source-id: 745b8db84dcd6cb58f1a8675ad3ff8d033bc50df
2019-04-05 07:21:39 -07:00
James Reed
85f36014e2 Experimental logging/counters API (#18235)
Summary:
This defines a generic counters API that users can utilize to provide monitoring functionality in e.g. a production service. We expose both counters for runtime internals as well as a TorchScript API to create user-defined counters. Synopsis of the API:

- `torch/csrc/jit/script/logging.h` specifies the externally-facing API in C++
- `torch/jit/_logging.py` specifies the Python API

We use an interface, `LoggerBase`, to define the interactions between users and a logging backend. Implementing a subclass of `LoggerBase` allows the user to handle these events in a custom way, such as logging into a DB or calling into an infra-specific counters API.

From the frontend perspective, we can create log events in two ways:
1. We provide an `add_stat_value(name, val)` function. This calls into the Logger backend with a key/value pair. For example, we might call `add_stat_value('foo', 1)` to bump an event counter.
2. We provide a `time_point()` function to record a timestamp in nanoseconds. This can be used in conjunction with `add_stat_value` to record runtime wall clock durations.

Examples of frontend usage can be found in `test_jit.py TestLogging`.

We provide a trivial `LockingLogger` implementation as an example and for testing purposes. It is likely not ready for production usage. It demonstrates that a backend implementing the API can do things like specify aggregation types and report these aggregate stats via the `get_counters()` API.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18235

Differential Revision: D14545060

Pulled By: jamesr66a

fbshipit-source-id: 04099543a1898cfdd411511e46e03d5dce9b4881
2019-03-29 17:14:03 -07:00
James Reed
1d26a3ae7e Open registration for c10 thread pool (#17788)
Summary:
1. Move ATen threadpool & open registration mechanism to C10
2. Move the `global_work_queue` to use this open registration mechanism, to allow users to substitute in their own
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17788

Reviewed By: zdevito

Differential Revision: D14379707

Pulled By: jamesr66a

fbshipit-source-id: 949662d0024875abf09907d97db927f160c54d45
2019-03-08 15:38:41 -08:00
Edward Yang
4404762d7d Rename IntList to IntArrayRef. (#16751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751

This was made more complicated by the fact that ivalue::IntList
is a thing.  So I had to fix all of the sites where we referring
to IValue post facto.

The following codemods were run, in this order:

```
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>'
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>'
```

Some manual fixups were done afterwards; they can be reviewed separately
at https://github.com/pytorch/pytorch/pull/16752

Reviewed By: dzhulgakov

Differential Revision: D13954363

fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64
2019-02-05 14:54:34 -08:00
Zachary DeVito
c42431bd7a Revert D13740752: [c10] plug caffe2 into jit
Differential Revision:
D13740752

Original commit changeset: 2d9383574d42

fbshipit-source-id: e9ff217a438720423340a10af7fa263b33f2ae24
2019-01-25 12:29:19 -08:00
Bram Wasti
6d2aee4a9b plug caffe2 into jit (#16331)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16331

Temporary measure to enable caffe2 ops in pytorch

Reviewed By: smessmer

Differential Revision: D13740752

fbshipit-source-id: 2d9383574d42ce84ee471aba32eeb4f5a0cc7a4c
2019-01-24 22:28:21 -08:00
Mikhail Zolotukhin
47bf30661f Directly include headers from ATen.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16287

Differential Revision: D13792949

Pulled By: ZolotukhinM

fbshipit-source-id: d627d8dc469df048063c70d0b5b8d33fede809a3
2019-01-24 11:22:27 -08:00
Sebastian Messmer
0ab8de3125 Remove some dependencies from ivalue.h to ATen (#15855)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15855

This is preparation work for moving IValue to c10.

Reviewed By: ezyang

Differential Revision: D13605259

fbshipit-source-id: cc545f582ab8607bb02aaf71273cb2710200b295
2019-01-17 16:03:58 -08:00
Guoqiang Jerry Chen
6641b09fac respect grad guard for torch.jit._fork and torch.jit._wait (#16101)
Summary:
respect grad guard for torch.jit._fork and torch.jit._wait.

Verified that the test failed without the fix, and pass with the fix.

Ideally I would like to enable and disable grad inside the forked function.
It doesn't seems like it's supported at this moment. This code handles that
as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16101

Differential Revision: D13708374

Pulled By: gqchen

fbshipit-source-id: 0533f080c4d0253fb4c61d2a0d3cc22de5721a09
2019-01-17 11:12:57 -08:00
Michael Suo
f636dc9276 clang format world (#15524)
Summary:
The PR clang-formats everything in `torch/csrc/jit/` and adds it to the pre-commit hook.

Here is a list of non-mechanical changes:
- I went over each file and fixed up whenever I could tell that clang-format was clobbering comment formatting.
- Made the macros in register_prim_ops a little more clang-format friendly by omitting trailing commas
- Refactored autodiff.cpp to use a helper class with explicit state rather than a bunch of capturing lambdas
- Small improvements to the precommit hook clang-format
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15524

Differential Revision: D13547989

Pulled By: suo

fbshipit-source-id: 3ff1541bb06433ccfe6de6e33f29227a2b5bb493
2018-12-26 06:55:01 -08:00
James Sun
88bf683cbc Support error handling in forked threads (#14523)
Summary:
Save error info in the future for parent thread to pick up. Throw the error
when the thread is the root thread.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14523

Differential Revision: D13251756

Pulled By: highker

fbshipit-source-id: b40f9a45665e1a934743f131ec5e8bad5622ce67
2018-12-19 18:54:46 -08:00
Peter Goldsborough
73ee7fda4c Remove deprecated variable_tensor_functions (#15003)
Summary:
Removing the deprecated functions in `torch/csrc/variable_tensor_functions.h` (like `torch::CPU`) and corresponding implementations from `torch/csrc/torch.cpp` from master after the release.

ezyang gchanan soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15003

Differential Revision: D13418086

Pulled By: goldsborough

fbshipit-source-id: a0accdf6f7b0efa1ec07ac7b74b86ff2da37543f
2018-12-11 17:16:11 -08:00
Edward Yang
517c7c9861 Canonicalize all includes in PyTorch. (#14849)
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root level directory.

I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.

I used the following script to do the canonicalization:

```
  import subprocess
  import re
  import os.path

  files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
  for fn in files:
      if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
          continue
      if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
          continue
      with open(fn, 'r') as f:
          c = f.read()
      def fmt(p):
          return "#include <{}>".format(p)
      def repl(m):
          p = m.group(1)
          if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
              return fmt(p)
          if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
              return fmt(p)
          for root in ["aten/src", "torch/lib", ""]:
              for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                  new_p = os.path.relpath(os.path.join(bad_root, p), root)
                  if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                      return fmt(new_p)
          print("ERROR: ", fn, p)
          return m.group(0)
      new_c = re.sub(r'#include "([^"]+)"', repl, c)
      if new_c != c:
          print(fn)
          with open(fn, 'w') as f:
              f.write(new_c)
```

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849

Reviewed By: dzhulgakov

Differential Revision: D13363445

Pulled By: ezyang

fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
2018-12-08 19:38:30 -08:00
James Sun
186341c5dc Merge Caffe2 and PyTorch thread pool definitions (#14114)
Summary:
(1) Move Caffe2 thread pool to aten
(2) Use the same thread pool definition for PyTorch interpreter
(3) Make ivalue::Future thread-safe
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14114

Reviewed By: ilia-cher

Differential Revision: D13110451

Pulled By: highker

fbshipit-source-id: a83acb6a4bafb7f674e3fe3d58f7a74c68064fac
2018-11-28 18:10:20 -08:00
James Sun
d02781a2ef Make InterpresterStateImpl a intrusive_ptr_target (#13784)
Summary:
InterpresterStateImpl con continue its lifecycle by increment the ref
count itself. This patch also removes InterpresterState::clone()
interface that conflicts with intrusive_ptr_target that disallows copy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13784

Differential Revision: D13015451

Pulled By: highker

fbshipit-source-id: a05f1ea6549d52ec693ccffefaa4d520b2474b8c
2018-11-09 23:39:18 -08:00
James Sun
dca3c2c60f Save and execute futures in a task queue (#13212)
Summary:
Upon calling wait(), save the forked thread and the current thread to a
task queue. A idling thread (which currently is single threaded) should
pick a ready task and run till there is nothing in the task queue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13212

Differential Revision: D12884522

Pulled By: highker

fbshipit-source-id: b3942a0ee63c148e05f5f41bdc73007fa3c3368e
2018-11-09 01:46:35 -08:00
Peter Goldsborough
0479517325 Add modernize-* checks to clang-tidy (#13196)
Summary:
Enables almost all `modernize-*` checks in clang-tidy. This warns against things such as:

- Use of `const std::string&` instead of new-style `std::string` + move,
- Using old-style loops instead of range-for loops,
- Use of raw `new`
- Use of `push_back` instead of `emplace_back`
- Use of `virtual` together with `override` (`override` is sufficient)

ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13196

Differential Revision: D12891837

Pulled By: goldsborough

fbshipit-source-id: 4d0f782a09eb391ee718d3d66f74c095ee121c09
2018-11-02 20:30:40 -07:00
mruberry
6fe089c6ea Hierarchical device independent -> device specific architecture (#13108)
Summary:
This PR principally redesigns the fuser's logical flow to be hierarchical, with device-independent logic directing (relatively little) device-specific logic. This design is based on reviews of XLA, TVM, internal design review at NVIDIA and discussions with fuser owners at Facebook. To further vet the design I have begun developing the next significant PR (extended fusion logic) on top of this architecture and it has made the work significantly easier. This PR also improves fuser modularity, which should make it easier for others to contribute to. Unfortunately, this PR is large and its nature has made breaking it into smaller pieces challenging. Future PRs should be smaller.

The fusion flow is now:

- Fusions are "registered" and "upfront compilation" occurs. The fusion specifications, which includes the graph, go into a thread-safe device-independent cache. Upfront compilation generates some information used later during shape inference.
- Fusions are run, which passes them to an executor that performs shape inference, requests an instantiated fusion from the specification's thread-safe store, and launches them. Launch logic eventually defers to device-specific logic.
- Fusions not previously instantiated are compiled. Compilation is device-specific and arg-specific. Compilation logic eventually defers to device-specific logic.
- If the fusion could not be run because fusion on the requested device is disabled or shape inference fails a fallback is invoked.

This flow can be thought of as PyTorch IR -> Device-Independent Fusion Logic -> Device-Specific Fusion Logic. The current upstream logic is, by contrast, PyTorch IR -> Device-Specific Logic -> Device-Independent Logic, which results in needless code duplication and lack of conceptual clarity. That was my mistake when splitting the fuser off from the rest of the jit and our reviews since then have been incredibly helpful in understanding why the approach in this PR is better.

This PR does not only move code around. It also fixes few couple bugs and makes some logical/code changes.

Bug fixes:
- thread-safety is improved with caches preventing concurrent access
- the nvrtc version is now reviewed to determine the appropriate compute architecture to compile for, fixing a bug that would cause runtime errors if a user's nvrtc didn't support the compute architecture their gpu reported
- an issue with DeviceGuard not setting the device properly and failing silently is worked-around (ezyang mentioned he was reviewing the dynamic registration DeviceGuard uses, which may resolve the issue)

Code/Logical changes:
- "const" now appears many more places (note: I cast const away in operator.h because of some obscure build issues -- I think we should be able to fix this and will take a look while this goes through testing)
- The new flow allowed some redundant code to be removed (AnnotatedGraph is gone, for example, and the more straightforward flow eliminated duplication of effort elsewhere)
- Fallback logic is now also invoked if a fusion is requested on a device that cannot handle fusions
- Use of macros to determine which files are compiled is reduced (though they may come back if the Windows build is unhappy)
- There is no more "common" code or folder, the device-independent logic being at the forefront of the fuser replaces and improves upon the goal of sharing code

apaszke who I promised naming rights to
zdevito who correctly pointed out that the device-independent logic should be the bulk of what the fuser is doing
ngimel who contributed to the design of this architecture
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13108

Reviewed By: gchanan, fmassa

Differential Revision: D12850608

Pulled By: soumith

fbshipit-source-id: 24e2df6dfa97591ee36aeca8944519678c301fa3
2018-10-31 18:13:00 -07:00
Elias Ellison
59f8e8ada7 First step at adding exceptions (#12789)
Summary:
This is a first step towards adding exceptions. We need minimal support in order to begin converting the torch library to weak script mode (which is the main goal here).

Some limitations (that are documented in the tests & compiler):
1. Cannot assign exceptions to variables
2. Any name after raise is being treated as a valid Exception
3. No control flow analysis yet. Below a will be undefined:

if True:
     a = 1
else:
     raise Exception("Hi")
return a
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12789

Differential Revision: D12848936

Pulled By: eellison

fbshipit-source-id: 1f60ceef2381040486123ec797e97d65b074862d
2018-10-30 20:25:50 -07:00
Yangqing Jia
713e706618 Move exception to C10 (#12354)
Summary:
There are still a few work to be done:

- Move logging and unify AT_WARN with LOG(ERROR).
- A few header files are still being plumbed through, need cleaning.
- caffe2::EnforceNotMet aliasing is not done yet.
- need to unify the macros. See c10/util/Exception.h

This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches:

(1) add //caffe2/c10:c10 to your dependency (or transitive dependency).
(2) change objects such as at::Error, at::Optional to the c10 namespace.
(3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes.

Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354

Reviewed By: orionr

Differential Revision: D10238910

Pulled By: Yangqing

fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32
2018-10-15 13:33:18 -07:00
Zachary DeVito
bd09ab6687 Remove stages from IR, they are not longer used
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12352

Differential Revision: D10219743

Pulled By: zdevito

fbshipit-source-id: 4d9441dc3748616f9b1f0734c65ec1a7abb0d663
2018-10-05 13:58:15 -07:00
David Riazati
d1ac1eba3b Add bool type to IR (#11834)
Summary:
This PR adds a bool type to `IValue` and puts it into place.

* changes conds for `prim::If` and `prim::Loop` to use `bool` type
* changes operators that take `bool`s to match their native ops
* fixes ambiguous `aten` ops `aten::std` and `aten::var`
	* fixes tests in `test_jit.py TestJitGenerated`
		```
		'test_std_dim',
		'test_std_dim_1d',
		'test_std_dim_1d_neg0',
		'test_std_dim_neg0',
		'test_var_dim',
		'test_var_dim_1d',
		'test_var_dim_1d_neg0',
		'test_var_dim_neg0'
		```
* adds `prim::BoolToTensor` and `prim::TensorToBool`

apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11834

Differential Revision: D9928570

Pulled By: driazati

fbshipit-source-id: 373c53df2f1a8ffa9e33d9a517002fbeef25f3eb
2018-10-03 12:40:03 -07:00
Michael Suo
7f35e92af2 mutable lists (#10700)
Summary:
This PR implements the design that we discussed. Changes:
- Added a World token IValue and type. The IValue is basically a dummy struct for now, in the future we may extend it (say, add thread-local state).
- Effectful ops explicitly declare they are mutable by having World tokens as inputs and outputs in their schema.
- Purely functional ops that use mutable values will get "fenced" and the world token will be threaded through the fences
- AnnotateEffects pass which wires up all the world tokens together.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10700

Reviewed By: eellison

Differential Revision: D9547881

Pulled By: michaelsuo

fbshipit-source-id: ebbd786c31f15bf45e2ddb0c188438ff2f5f3c88
2018-09-27 19:25:13 -07:00
Edward Yang
11bd2f2509 Retainable is no more (#11900)
Summary:
Stack:
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; **#11900 Retainable is no more**&nbsp;&nbsp;[💛](https://our.intern.facebook.com/intern/diff/D9977505/)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; #11902 Refactor fastGet/fastSet for clarity, removing a null pointer check.&nbsp;&nbsp;[💛](https://our.intern.facebook.com/intern/diff/D9977654/)

Kill it with fire
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11900

Differential Revision: D9979779

Pulled By: ezyang

fbshipit-source-id: 0a437e7a0baadb6440e7dc39a01b4a406171faa7
2018-09-21 06:58:18 -07:00
Edward Yang
f6a6d7fae1 Switch at::TensorImpl to store TypeMeta rather than ScalarType
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11702

Reviewed By: cpuhrsch

Differential Revision: D9831384

fbshipit-source-id: 1b1233a70ed70b47a3dab4a5797b6cfcb7a2c265
2018-09-17 09:09:35 -07:00
Mike Ruberry
96d3f968eb Splits CPU and CUDA fusion compilers (#10981)
Summary:
This PR splits the CPU and CUDA fusion compilers, putting them into a new jit/fusers/ directory with jit/fusers/common for common components. In particular:

- A fusion interface is created that allows "fusion handles" to be requested
- The CPU and CUDA fusers implement this interface, with dispatch determined by device
- The fusion compilers, fusion function specializations and resource strings are split
- CPU-specific classes like TempFile and DynamicLibrary are in the CPU fuser
- Common classes likes TensorDesc and the base fusion function class are in jit/fusers/common
- There is still some specialization in jit/fusers/common, but these specializations are small(-ish)
- Updates the build system to remove the dummy interface on Windows and minimize the use of macros

This structure should allow in-flight PRs to easily rebase while providing a clear interface to the fusers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10981

Reviewed By: soumith

Differential Revision: D9701999

Pulled By: apaszke

fbshipit-source-id: 3b6bec7b97e0444b2a93caa38d9b897f2e68c1b3
2018-09-14 14:05:34 -07:00
Gregory Chanan
cee743f639 Move backward/set_data to Type-based dispatch.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11440

Differential Revision: D9736565

Pulled By: gchanan

fbshipit-source-id: 1e66f54f1c87084f37c0b014030f0d6d2f8dfaee
2018-09-10 08:40:29 -07:00
Adam Paszke
00df09b65d Change specialization rules in GraphExecutors (#10977)
Summary:
**Review last commit only.** Stacked on top of #10949.

This commit fixes a number of issues connected to caching
differentiability status of graphs inside graph executors,
and changes the rules for optimization of differentiable subgraphs.
Previously every one of those was instantiated as a separate graph
executor, but now they are simply heavier-optimized graph regions,
and graph executors are only instantiated for their backward.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10977

Differential Revision: D9600626

Pulled By: apaszke

fbshipit-source-id: dad09a0f586e396afbd5406319c1cd54fbb8a3d3
2018-08-30 22:11:01 -07:00
Edward Yang
f7b02b3a68 Change Tensor/TensorImpl to use c10::intrusive_ptr (#10824)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10824

API additions:
- Tensor(c10::intrusive_ptr<TensorImpl,UndefinedTensor>&&)
- Tensor(const c10::intrusive_ptr<TensorImpl,UndefinedTensor>&)
- Tensor::operator=(Tensor&&) && (for completeness sake)
- TensorBase::unsafeGetTensorImpl()
- TensorBase::unsafeReleaseTensorImpl()
- TensorBase::getIntrusivePtr()
- TensorImpl::type_id()
- Tensor::set_data()
- Tensor::is_same(Tensor)
- Tensor::use_count()
- Tensor::type_id()
- Tensor::scalar_type()
- WeakTensor::is_same(WeakTensor)
- intrusive_ptr::weak_use_count()
- weak_intrusive_ptr::weak_use_count()
- c10::raw::intrusive_ptr::{incref,decref,make_weak}
- c10::raw::weak_intrusive_ptr::{incref,decref,lock}

API changes:
- Tensor::pImpl is no longer public (and now named tensor_impl_)
    - Most methods accessed this way are now accessible on Tensor
      maybe_zero_dim() and set_wrapped_number() being prominent exceptions
      (they are now accessed through unsafeGetTensorImpl())
- Type is no longer friend of Tensor
- TensorBase::reset(TensorImpl*) is deleted
- TensorBase::reset(TensorImpl*, bool should_retain) is deleted
- TensorBase::swap(TensorBaseImpl&) is deleted; use std::swap instead
- TensorBase::get() is deleted; use unsafeGetTensorImpl() instead
- TensorBase::detach() is deleted; use unsafeReleaseTensorImpl() instead
- TensorBase::retain() is deleted; use _raw_incref() instead
- TensorBase::release() is deleted; use _raw_decref() instead
- WeakTensor lost most of its methods (it no longer inherits from
  TensorBase)
- TensorImpl::storage() is now a const method
- Tensor(TensorBase) constructor removed, instead
  we go through getIntrusivePtr().  I'm not sure about
  this change; I happened to have accidentally removed the
  TensorBase constructor and decided to fix call sites,
  but I could go the other way.
- detail::set_data() is deleted; use Tensor::set_data() instead
- c10::raw_intrusive_ptr_target removed; use the functions in c10::raw instead.
  (The reason for this change, is that it is invalid to cast an intrusive_ptr_target*
  to a raw_intrusive_ptr_target* to take advantage of the methods. But there is
  no reason the incref/decref methods shouldn't also work on intrusive_ptr_target;
  it is primarily an API consideration. We can be more standards compliant by
  keeping them as functions, which are universally applicable.)
- intrusive_ptr::reclaim() and weak_intrusive_ptr::reclaim() now work on
  pointers of the NullType. (This counts as a bug fix, because the documentation
  specified that pointers produced by release() are valid to reclaim(), and
  a release() on a null intrusive_ptr produces the NullType::singleton())

Bug fixes:
- Dispatch code for mutable references incorrectly returned
  a reference to a value argument (which would immediately
  go out of scope).  They now correctly return a tensor by
  value.
- intrusive_ptr copy/move assignment did not work correctly when
  an object was assigned to itself. We now check for this case and
  no-op if so. (This bug manifested itself as a Tensor mysteriously
  becoming an UndefinedTensor after lines of code like
  'x = x.mul_(y)')

Other changes:
- The checked cast functions in Utils.h have now been
  renamed and detemplatized into checked unwrap functions.
- Added type_id() and scalar_type() methods to Tensor
- pImpl is no longer public
- Documented what the && overloads are doing
- All occurrences of 'new TensorImpl' (and similar spellings, like 'new THTensor')
  have been expunged. This is NO LONGER a valid way to create a new
  tensor, and if you do this, upon your first incref, you will catch an ASSERT
  failure saying that only tensors created by intrusive_ptr::release() are valid
  to reclaim(). Use c10::make_intrusive instead in this situation.
- IValue is adjusted to use intrusive_ptr instead of Retainable, and all
  other sub-classes of Retainable were modified to use intrusive_ptr.
  When doing this, I had to make the constructors of sub-classes like
  ConstantList public, so that c10::make_intrusive could invoke them.  Fortunately,
  if you incorrectly stack allocate a ConstantList, and then try to get an
  intrusive_ptr to it, it will fail, as stack allocated ConstantLists have refcount 0.
- IValue very narrowly sidesteps the problem of handling NullType, as it
  considers intrusive_ptr<TensorImpl> identical to intrusive_ptr<TensorImpl, UndefinedTensor>
  which is not always true. This was always the case, but there's now a comment
  explaining what's going on.

Some MSVC bugs were uncovered during the preparation of this patch.
They are documented as comments in the code.

Reviewed By: gchanan

Differential Revision: D9481140

fbshipit-source-id: 14a8ea0c231ed88b5715fb86d92730926f9f92fc
2018-08-27 16:11:01 -07:00
Adam Paszke
c8b246abf3 Prevent JIT from overspecializing to every single size configuration (#10844)
Summary:
Please review the expects carefully to make sure there are no regressions. I tried to go over them one by one when they changed, but it's sometimes easy to miss finer details.

Summary of changes:

- Renamed `TensorType` to `CompleteTensorType`. Added a new `TensorType` which records only the scalar type, number of dimensions, and device of a value. The argument behind the rename is to encourage people to use `CompleteTensorType` less, as most passes will only have limited information available. To make transition easier `complete_type->cast<TensorType>()` works, and makes our passes work with both kinds of specialization if they don't need extra the extra detail.
- Renamed `ArgumentSpec` to `CompleteArgumentSpec`. Added a new `ArgumentSpec`, which matches argument only at the level of the new `TensorType`.
- Shape analysis can process graphs with both `CompleteTensorType` and `TensorType`.
- Fuser was a part that heavily relied on full shape information being available. Now, we simply try to fuse the largest possible graphs, and have to do run-time checks to make sure they match the code we generate. If they don't, we fall back to regular interpretation. The shape checks are implementing using an optimized method exploiting algebraic properties of shapes with broadcasting, and the relations of broadcasting with pointwise ops. A full written proof of correctness of the shape checking algorithm is included in a comment in `graph_fuser.cpp`.

zdevito ezyang mruberry ngimel csarofeen
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10844

Differential Revision: D9498705

Pulled By: apaszke

fbshipit-source-id: 0c53c2fcebd871cc2a29c260f8d012276479cc61
2018-08-26 09:54:48 -07:00
Edward Yang
19031c68dc Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage (#10488)
Summary:
```
Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage

This patch does two major changes:

- It replaces the use of Retainable in Storage with a new implementation
  based on intrusive_ptr.  This will be necessary because Caffe2 will
  be using this class to implement intrusive_ptrs, and we need to
  line these up for the merge.  One good thing about the new implementation is
  that the default copy/move constructors/assignment operators and destructor
  work automatically, instead of needing to be hardcoded into Storage/Tensor.

- It replaces all places where we returned std::unique_ptr<Storage> with
  Storage, collapsing an unnecessary double indirection that is no longer
  necessary now that we have correctly working copy/move constructors.

I didn't initially want to do step (2), but it was very important to
eliminate all bare uses of new Storage and new StorageImpl, and this making
the API change was the most straightforward way to do this.

HOW TO FIX YOUR CODE IN THE NEW API

- You no longer need to dereference the result of tensor.storage() to pass
  it to set.  So, instead of:

      x.set_(*y.storage());

  just write:

      x.set_(y.storage());

- If you were accessing methods on StorageImpl via the pImpl() method, you
  must use the dot operator to run pImpl().  Even better; just drop pImpl,
  we now have method forwarding.  So, instead of:

      storage->pImpl()->data();

  just do:

      storage->data();
      // storage.pImpl()->data() works too but is not as recommended

- storage->getDevice() is no more; instead use storage->device().index()

MISC CODE UPDATES

- retain, release, weak_retain, weak_release and weak_lock are now
  reimplemented using the "blessed API", and renamed to make it
  clearer that their use is discouraged.

- nvcc OS X and general OS X portability improvements to intrusive_ptr

- A new comment in intrusive_ptr describing how stack allocated
  intrusive_ptr_targets work differently than heap allocated ones
  from c10::make_intrusive

CAVEAT EMPTOR

- THStorage_weakRetain used to work on strong pointers, but it NO LONGER
  works with intrusive_ptr.  You must reclaim the strong pointer into a
  real strong pointer, construct a weak pointer from it, and then release
  the strong and weak pointers.  See StorageSharing.cpp for an example.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10488

Reviewed By: gchanan

Differential Revision: D9306134

Pulled By: ezyang

fbshipit-source-id: 02d58ef62dab8e4da6131e1a24834a65c21048e2
2018-08-21 21:39:55 -07:00
Gregory Chanan
00f2731112 Merge THTensor into TensorImpl
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10479

Differential Revision: D9315800

Pulled By: gchanan

fbshipit-source-id: b13ef0de3342600b02b54e0700eb02021a9d1a9e
2018-08-16 08:10:06 -07:00
Edward Yang
64235d5c01 Rewrite TensorImpl to use TensorTypeId. (#10278)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10278

Translation to Backend happens immediately before we go into the
Type universe; otherwise we use TensorTypeId.

I allocated TensorTypeId corresponding exactly to existing ATen
Backend.  Only CPUTensorId and CUDATensorId are relevant in the
Caffe2 universe.

Reviewed By: gchanan

Differential Revision: D9184060

fbshipit-source-id: 9d3989c26f70b90f1bbf98b2a96c57e2b0a46597
2018-08-13 11:20:04 -07:00