Commit Graph

180 Commits

Author SHA1 Message Date
CodemodService FBSourceClangFormatLinterBot
cbfce376a8 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D28319469

fbshipit-source-id: 8295597a8ee16b2fef3f7aacdd6c892cb22db988
2021-05-10 03:39:31 -07:00
Nikita Shulga
3a66a1cb99 [clang-tidy] Exclude cppcoreguidelines-avoid-magic-numbers (#57841)
Summary:
Add cppcoreguidelines-avoid-magic-numbers exclusion to clang-tidy
Remove existing nolint warnings using following script:
```
for file in `git ls-files | grep -v \.py`; do gsed '/^ *\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers)/d' -i  $file; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57841

Reviewed By: samestep

Differential Revision: D28295045

Pulled By: malfet

fbshipit-source-id: 7c6e8d1213c9593f169ed3df6a916498f1a97163
2021-05-07 20:02:33 -07:00
Scott Wolchok
7870450706 [PyTorch] Use c10::ThreadLocal instead of thread_local in record_function.cpp for specific __GLIBCXX__ on Android (#57689)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57689
* Older versions of libgnustd have issues with the thread_local C++ qualifier on Android devices built with NDK versions prior to r17. Use the c10::tls<> wrapper with smart-pointer semantics in such cases.
* Convenient macro `C10_DEFINE_TLS_static` was added as well:

```
  // Define static TLS variable str_tls_ of type std::string
  C10_DEFINE_TLS_static(std::string, str_tls_);

  //////// Exercise it ////////
  {
     *str_tls_ = "abc";
     assert(str_tls_->length() == 3);
  }
```
ghstack-source-id: 128233742

Test Plan: CI +

Reviewed By: ilia-cher

Differential Revision: D27875779

fbshipit-source-id: 7764f96ac1e121051c6ea66eabcedb9ef54d290e
2021-05-06 00:13:33 -07:00
Scott Wolchok
44cc873fba [PyTorch] Autoformat c10 (#56830)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56830

Opt into formatting on GitHub and format everything. This is a trial run before turning on formatting for more and eventually all of the codebase.

Test Plan: CI

Reviewed By: zertosh

Differential Revision: D27979080

fbshipit-source-id: a80f0c48691c08ae8ca0af06377b87e6a2351151
2021-04-30 21:23:28 -07:00
Brad Fish
2c2aa9e030 Address temp file/bind race condition in torch_shm_manager (#57309)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57309

Addressing a race condition that can occur in `torch_shm_manager` between the time its temporary file is unlinked and when it `bind()`s the manager server socket to that same name. In that time window, other threads/processes can re-create another temporary file with the same name, causing `bind()` to fail with `EADDRINUSE`.

This diff introduces `c10::TempDir` and associated helper functions that mirror those of `c10::TempFile` and generates the manager socket name using a combination of a temporary directory, which will be valid for the lifetime of `torch_shm_manager`, and a well-known file name within that directory that will never be used outside of `bind()`.
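
A minimal POSIX sketch of that pattern, assuming illustrative names (`bind_manager_socket`, the `/tmp` template) rather than the actual `c10::TempDir` API:

```
#include <sys/socket.h>
#include <sys/un.h>
#include <cstdlib>
#include <cstring>
#include <string>
#include <unistd.h>

// Create a fresh temporary directory, then bind the manager socket at a
// well-known name inside it. Because the directory itself is unique and
// lives as long as the manager, no other process can race us to the path.
int bind_manager_socket() {
  char dir_template[] = "/tmp/torch-shm-dir-XXXXXX";
  if (mkdtemp(dir_template) == nullptr) return -1;  // directory is now ours
  std::string path = std::string(dir_template) + "/manager.sock";

  int fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) return -1;
  sockaddr_un addr{};
  addr.sun_family = AF_UNIX;
  std::strncpy(addr.sun_path, path.c_str(), sizeof(addr.sun_path) - 1);
  // No unlink-then-bind window: nothing else ever creates this path.
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
    close(fd);
    return -1;
  }
  return fd;
}
```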

Reviewed By: ejguan

Differential Revision: D28047914

fbshipit-source-id: 148d54818add44159881d3afc2ffb31bd73bcabf
2021-04-30 11:11:07 -07:00
Luca Wehrstedt
682476022f Introduce generic MultiStreamGuard (#57049)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57049

There was a comment above CUDAMultiStreamGuard which said "TODO: Implement this generically in c10". This is what I'm doing here.

The new generic MultiStreamGuard class takes a vector of device-agnostic c10::Streams and supports any device type (CUDA, but also ROCm and others) by using a VirtualGuardImpl. A class called CUDAMultiStreamGuard is still kept around, for convenience, and slightly for performance, as it avoids a vtable lookup.
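
A hedged usage sketch, assuming the guard is constructed from a collection of streams as described:

```
#include <c10/core/StreamGuard.h>
#include <vector>

// Set the current stream on each stream's device, for any backend; the
// previous current streams are restored when the guard is destroyed.
void run_on(const std::vector<c10::Stream>& streams) {
  c10::MultiStreamGuard guard(streams);
  // ... enqueue work on each device's now-current stream ...
}  // previous streams restored here
```
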
ghstack-source-id: 127713139

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D28029158

fbshipit-source-id: 2f3181371f8cb0d77a3b2e6aa510f1dd74e8f69b
2021-04-29 09:31:47 -07:00
Nikita Shulga
4cb534f92e Make PyTorch code-base clang-tidy compliant (#56892)
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892

Reviewed By: H-Huang

Differential Revision: D27991944

Pulled By: malfet

fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
2021-04-28 14:10:25 -07:00
Xiang Gao
0cc42809ce Enable skipped test for c10::complex on CUDA >= 11.2 (#50227)
Summary:
That test was skipped due to a compiler bug. That bug should be fixed in 11.2, so we should enable it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50227

Reviewed By: malfet

Differential Revision: D27909195

Pulled By: anjali411

fbshipit-source-id: c802702079d0e521f53fc98cd0fc3ded0c12b455
2021-04-21 18:33:31 -07:00
Nikita Shulga
087049000b Make c10 clang-tidy clean (#55870)
Summary:
This change was autogenerated by running:
```
% find c10 -iname "*.cpp" -exec python3 tools/clang_tidy.py -c build -x {} -s \;
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55870

Reviewed By: janeyx99

Differential Revision: D27728617

Pulled By: malfet

fbshipit-source-id: bede4d7f0c106d51394d1e9efddf01bf894421c5
2021-04-14 11:23:28 -07:00
Scott Wolchok
86368700e8 [PyTorch] Change MaybeOwned tests to use intrusive_ptr and Tensor (#55684)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55684

Upcoming changes to `MaybeOwned<T>` will require that T is
one of these two types and will have custom code for both.

This diff updates the tests to continue to build under these new
requirements; it is being sent separately to demonstrate that the
tests continue to work on the current implementation.
ghstack-source-id: 126405918

Test Plan: CI will run the rewritten tests.

Reviewed By: bhosmer

Differential Revision: D27630289

fbshipit-source-id: e38097d9ca04f3337cfa543ebcc8fb5d6916fcf3
2021-04-13 18:53:43 -07:00
Scott Wolchok
ea446ed600 [PyTorch] Allow copy operations on MaybeOwned (#55419)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55419

Turns out it's useful to have these. I chose to implement them in the straightforward safe way, rather than always borrowing.
ghstack-source-id: 126369328

Test Plan: Added more automated tests.

Reviewed By: hlu1

Differential Revision: D27545805

fbshipit-source-id: 84bb4458b86672ad340cc1f0aa18b80ca7ee13f1
2021-04-13 13:21:45 -07:00
Scott Wolchok
6fd875923e [PyTorch] Add MaybeOwned::operator*() && (#55244)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55244

Add the ability to move from the underlying object in a `MaybeOwned`.

FWIW, `MaybeOwned` is new territory for me personally and this move-and-dereference operation is even more so, but I think it makes sense and the tests pass.
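
A hedged sketch of what the rvalue-qualified dereference enables (at::Tensor payload assumed):

```
#include <ATen/core/Tensor.h>
#include <c10/util/MaybeOwned.h>
#include <utility>

// When the MaybeOwned owns its Tensor, *std::move(mo) can move the payload
// out instead of bumping the refcount; a borrowed one still copies safely.
at::Tensor take(c10::MaybeOwned<at::Tensor>&& mo) {
  return *std::move(mo);
}
```
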
ghstack-source-id: 126170046

Test Plan: Added automated tests.

Reviewed By: bhosmer

Differential Revision: D27522809

fbshipit-source-id: 82b180031e93d725209b6328f656315c232e5237
2021-04-09 22:14:59 -07:00
Hao Lu
93d0f636bb [c10] Add default constructor to MaybeOwned (#55128)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55128

Test Plan: CI

Reviewed By: swolchok

Differential Revision: D27495079

fbshipit-source-id: 3bd01956a8b65170d6b38096dbd15c4809904f88
2021-04-02 06:42:04 -07:00
frdong
92770d25cd fix comparison of narrow type with wide type in loop condition (#53951)
Summary:
fix Semmle warning: Comparison of narrow type with wide type in loop condition

For example, consider the following piece of code:
for (int i = 0; i < array.size(); ++i) {}

The problem is that array.size() returns size_t, which can be a wider type than int depending on the implementation, so there is a chance that i overflows (for a very large array whose size is beyond the range of int) and the loop never terminates.
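
A minimal sketch of the corrected form:

```
#include <cstddef>
#include <vector>

// Give the induction variable the container's own size type so the
// comparison never mixes widths and can never wrap.
long long sum(const std::vector<int>& array) {
  long long total = 0;
  for (std::size_t i = 0; i < array.size(); ++i) {
    total += array[i];  // safe even when array.size() exceeds INT_MAX
  }
  return total;
}
```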

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53951

Reviewed By: zou3519

Differential Revision: D27181495

Pulled By: malfet

fbshipit-source-id: 0612c5cedcdc656c193085e7fbb87dd163f20688
2021-03-22 16:40:35 -07:00
Scott Wolchok
0c8f16622b [Caffe2] Rework CAFFE_ENFORCE_THAT (#53303)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53303

The old code did a heap allocation unnecessarily and was a
little convoluted. I think that it was structured that way to avoid
double-evaluating arguments; I just forced them to be evaluated once
as though they were passed to a function by binding const references
to them.
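
A self-contained sketch of that evaluate-once-via-const-reference pattern; `ENFORCE_GE_DEMO` is illustrative, not the real `CAFFE_ENFORCE_GE`:

```
#include <cstdlib>
#include <iostream>

#define ENFORCE_GE_DEMO(a, b)                                    \
  do {                                                           \
    const auto& a_ = (a); /* each argument evaluated exactly */  \
    const auto& b_ = (b); /* once, as if passed to a function */ \
    if (!(a_ >= b_)) {                                           \
      std::cerr << "enforce failed: " #a " >= " #b "\n";         \
      std::abort();                                              \
    }                                                            \
  } while (0)

int counter = 0;
int bump() { return ++counter; }

int main() {
  ENFORCE_GE_DEMO(bump(), 0);  // bump() runs exactly once
  return counter == 1 ? 0 : 1;
}
```
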
ghstack-source-id: 123918262

Test Plan:
1) `buck run mode/opt-clang //caffe2/caffe2/fb/tests:logging_bench`

Before:
```
============================================================================
caffe2/caffe2/fb/tests/logging_bench.cpp        relative  time/iter  iters/s
============================================================================
glog_CHECK                                                   2.01ns  498.63M
caffe2_ENFORCE_GE                                 50.00%     4.01ns  249.31M
glog_CHECK_GE                                     17.39%    11.53ns   86.73M
fbcode_ENFORCE                                   100.00%     2.01ns  498.65M
caffe2_ENFORCE                                   100.00%     2.01ns  498.63M
caffe2_ENFORCE_THAT                               50.00%     4.01ns  249.33M
============================================================================
```

After:
```
============================================================================
caffe2/caffe2/fb/tests/logging_bench.cpp        relative  time/iter  iters/s
============================================================================
glog_CHECK                                                   2.01ns  498.63M
caffe2_ENFORCE_GE                                 97.44%     2.06ns  485.88M
glog_CHECK_GE                                     17.39%    11.53ns   86.73M
fbcode_ENFORCE                                   100.00%     2.01ns  498.65M
caffe2_ENFORCE                                   100.00%     2.01ns  498.65M
caffe2_ENFORCE_THAT                               97.28%     2.06ns  485.06M
============================================================================
```

Looks like about a 1.94x speedup!

2) Inspect generated assembly for logging_bench.cpp before & after by:
```
$ compile-commands caffe2/caffe2/fb/tests/logging_bench.cpp -f "mode/opt-clang"
$ jq -r '.[0].arguments | sh' < compile_commands.json | sed -e "s/'-c'/'-S'/g" | sed -E -e "s/'-g[12]'/'-g0'/g" > out.sh
$ sh out.sh
```

Then diff logging_bench.s as you like.

Before: P255408666
After: P277883307

Net about 1500 lines deleted from the assembly. We can see that the
happy path (which the benchmark tests) no longer contains string
creation.

Reviewed By: dzhulgakov

Differential Revision: D26829714

fbshipit-source-id: 6e11f8ea29292ae3d9f2cc89d08afcb06f7d39c9
2021-03-16 23:01:00 -07:00
Scott Wolchok
0606057af3 [PyTorch] Add c10::MaybeOwned and Tensor::expect_contiguous (#53317)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53317

This seems like it might help in cases where we have to call
`Tensor::contiguous`, but we expect that the tensor in question will
be contiguous a good portion of the time.
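
A hedged usage sketch of the new API:

```
#include <ATen/core/Tensor.h>
#include <c10/util/MaybeOwned.h>

// Fast path: if t is already contiguous we merely borrow it (no refcount
// bump); otherwise we own a freshly materialized contiguous copy.
void consume(const at::Tensor& t) {
  c10::MaybeOwned<at::Tensor> c = t.expect_contiguous();
  (void)c->data_ptr();  // use *c / c-> like a contiguous tensor
}
```
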
ghstack-source-id: 123203771

Test Plan:
Profiled AdIndexer on inline_cvr; time spent in
clip_ranges_gather_sigrid_hash_each_feature<int> was cut in half from
1.37% to 0.66%

Reviewed By: smessmer

Differential Revision: D26738036

fbshipit-source-id: b5db10783ccd103dae0ab3e79338a83b5e507ebb
2021-03-09 14:51:23 -08:00
Sam Estep
8c798e0622 Forbid trailing whitespace (#53406)
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857

These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
  - `GLOSSARY.md`
  - `aten/src/ATen/core/op_registration/README.md`
  - `scripts/README.md`
  - `torch/csrc/jit/codegen/fuser/README.md`

The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```

I looked over the auto-generated changes and didn't see anything that looked problematic.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406

Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377

This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348

Reviewed By: walterddr, seemethere

Differential Revision: D26856620

Pulled By: samestep

fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
2021-03-05 17:22:55 -08:00
Scott Wolchok
efbb854ed8 [PyTorch] Avoid std::string in TORCH_CHECK when possible (#52221)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52221

The previous code forced a `std::string` to be created even when the default message or a user-provided string literal message was used. Now it's not forced and we don't need an outlined lambda in those cases either.
ghstack-source-id: 121877056

Test Plan:
Compare assembly for

```
#include <c10/util/Exception.h>

void f(bool b) {
  TORCH_CHECK(b, "message");
}

void g(bool b) {
  TORCH_CHECK(b);
}

void h(bool b) {
  TORCH_CHECK(b, "message", random());
}
```

before/after in fbcode optimized build.

Before: P174696735
After: P174696840

For `f()` and `g()`, we go from a call to an outlined lambda that did a bunch of `std::string` creation to a load of a string constant before calling `torchCheckFail`. This is a clear improvement.

For `h()`, results are mixed: we save a bunch of *extra* string goop in the outlined lambda and instead call `c10::detail::_str_wrapper` directly. This is good for overall size. However, we no longer outline the call to `random()`, which is less than ideal. I hope to recover the ability to fully outline the `random()` call in future diffs; this is just thorny enough that I don't want to cram even more into one diff.

Added automated test to make sure `TORCH_CHECK` and `TORCH_INTERNAL_ASSERT` only evaluate their arguments once.

Profiled AdIndexer mergenet benchmark in perf to check that `IValue::toTensor` is still getting inlined.

Reviewed By: bhosmer

Differential Revision: D26380783

fbshipit-source-id: 288860772423994ac739a8f33e2c09f718e8dd38
2021-02-18 07:51:53 -08:00
Nikita Shulga
497b772547 Add custom implementation for csqrt if libc++ is used (#52018)
Summary:
libc++ implements csqrt using the polar form of the number, which results in higher numerical error if `arg` is close to 0, pi/2, pi, or 3pi/4
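
A hedged sketch of the kind of direct, non-polar formula such a custom implementation can use (the textbook principal square root, not necessarily the exact code in this PR):

```
#include <cmath>
#include <complex>

template <typename T>
std::complex<T> csqrt_direct(std::complex<T> z) {
  const T m = std::hypot(z.real(), z.imag());  // |z|, overflow-resistant
  const T re = std::sqrt((m + z.real()) / 2);  // sqrt((|z| + x) / 2)
  const T im = std::copysign(std::sqrt((m - z.real()) / 2), z.imag());
  return {re, im};
}
```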

Fixes https://github.com/pytorch/pytorch/issues/47500

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52018

Reviewed By: walterddr

Differential Revision: D26359947

Pulled By: malfet

fbshipit-source-id: 8c9f4dc45948cb29c43230dcee9b030c2642d981
2021-02-11 11:53:52 -08:00
Brian Hirsh
41bab9a4b6 Plumbing dispatch keys through the dispatcher (#49354)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49354

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D25614042

Pulled By: bdhirsh

fbshipit-source-id: 269a75e9a3ac518aa63bff2cafbd47ed2c4ff780
2021-02-08 11:09:51 -08:00
anjali411
09bc58796e Hashing logic for c10::complex (#51441)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51441

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D26170195

Pulled By: anjali411

fbshipit-source-id: 9247c1329229405426cfbd8463cabcdbe5bdb740
2021-02-01 15:56:44 -08:00
Richard Barnes
1b089c1257 Modernize for-loops (#50899)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50899

Test Plan: Sandcastle tests + OSS CI

Reviewed By: ezyang

Differential Revision: D26001931

fbshipit-source-id: d829d520f647aacd178e1c7a9faa6196cc5af54e
2021-01-29 10:52:31 -08:00
Richard Barnes
06c734d8c7 Generalize sum_intlist and prod_intlist, clean up dimensionality functions (#50495)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50495

Test Plan:
```
buck test mode/opt //caffe2/c10:c10_test_0
```

Reviewed By: ngimel

Differential Revision: D25902853

fbshipit-source-id: a7d30251ca443df57dd8005ed77dba7b2f1002d4
2021-01-19 22:35:55 -08:00
Richard Barnes
8e7402441d Move irange to c10 (#46414)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46414

For loops are often written with mismatched data types, which causes silent type and sign coercion in the absence of integer conversion warnings. Getting around this in templated code requires convoluted patterns such as
```
for(auto i=decltype(var){0};i<var;i++)
```
with this diff we can instead write
```
for (const auto i : c10::irange(var))
```
Note that this loop is type-safe and const-safe.

The function introduced here (`c10::irange`) allows for type-safety and const-ness within for loops, which prevents the accidental truncation or modification of integers and other types, improving code safety.
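
A hedged usage sketch, assuming the header path c10/util/irange.h:

```
#include <c10/util/irange.h>
#include <vector>

// The induction variable adopts the type of the bound (size_t here), so
// indexing with it never narrows, and i is immutable inside the body.
int sum(const std::vector<int>& v) {
  int total = 0;
  for (const auto i : c10::irange(v.size())) {
    total += v[i];
  }
  return total;
}
```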

Test Plan:
```
buck test //caffe2/c10:c10_test_0
```

Reviewed By: ngimel

Differential Revision: D24334732

fbshipit-source-id: fec5ebda3643ec5589f7ea3a8e7bbea4432ed771
2021-01-15 11:44:55 -08:00
Scott Wolchok
b73c018598 [PyTorch] Change representation of SizesAndStrides (#47508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47508

This moves SizesAndStrides to a specialized representation
that is 5 words smaller in the common case of tensor rank 5 or less.
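
A hedged sketch of the small-buffer idea; field names and layout here are illustrative, following the description above:

```
#include <cstdint>

class PackedSizesAndStrides {
  uint8_t size_ = 1;  // tensor rank
  union {
    int64_t inline_storage_[10];  // 5 sizes followed by 5 strides
    int64_t* out_of_line_;        // heap array of 2 * size_ entries
  };

  bool is_inline() const { return size_ <= 5; }
  const int64_t* data() const {
    return is_inline() ? inline_storage_ : out_of_line_;
  }

 public:
  PackedSizesAndStrides() : inline_storage_{} {}
  int64_t size_at(uint8_t i) const { return data()[i]; }
  int64_t stride_at(uint8_t i) const {
    // strides live after the sizes: offset 5 inline, size_ out of line
    return data()[(is_inline() ? 5 : size_) + i];
  }
};
```
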
ghstack-source-id: 119313560

Test Plan:
SizesAndStridesTest added in previous diff passes under
ASAN + UBSAN.

Run framework overhead benchmarks. Looks more or less neutral.

Reviewed By: ezyang

Differential Revision: D24772023

fbshipit-source-id: 0a75fd6c2daabb0769e2f803e80e2d6831871316
2021-01-07 21:01:46 -08:00
Scott Wolchok
882ddb2f2d [PyTorch] Introduce packed SizesAndStrides abstraction (#47507)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47507

This introduces a new SizesAndStrides class as a helper for
TensorImpl, in preparation for changing its representation.
ghstack-source-id: 119313559

Test Plan:
Added new automated tests as well.

Run framework overhead benchmarks. Results seem to be neutral-ish.

Reviewed By: ezyang

Differential Revision: D24762557

fbshipit-source-id: 6cc0ede52d0a126549fb51eecef92af41c3e1a98
2021-01-07 20:56:50 -08:00
Samuel Marks
8aad66a7bd [c10/**] Fix typos (#49815)
Summary:
All pretty minor. I avoided renaming `class DestructableMock` to `class DestructibleMock` and similar such symbol renames (in this PR).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49815

Reviewed By: VitalyFedyunin

Differential Revision: D25734507

Pulled By: mruberry

fbshipit-source-id: bbe8874a99d047e9d9814bf92ea8c036a5c6a3fd
2021-01-01 02:11:56 -08:00
Sebastian Messmer
56a157fc79 hacky_wrapper_for_legacy_signatures reorders out arguments (#48911)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48911

This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.
ghstack-source-id: 118379659

Test Plan: waitforsandcastle

Reviewed By: bhosmer

Differential Revision: D25363336

fbshipit-source-id: da075d2cc58814f886a25d52652511dbbe990cec
2020-12-10 23:29:00 -08:00
Basil Hosmer
6b94830cdc faithful signature support in BoxedKernelWrapper (#47267)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47267

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D24701488

Pulled By: bhosmer

fbshipit-source-id: dbce246319670f9590c5762ad20c26cb24575fe8
2020-11-10 13:58:36 -08:00
Scott Wolchok
ae7063788c [Pytorch] Add basic c10::optional tests (#47014)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47014

Some tests are better than zero tests.
ghstack-source-id: 115769678

Test Plan: Run new tests, passes

Reviewed By: smessmer

Differential Revision: D24558649

fbshipit-source-id: 50b8872f4f15c9a6e1f39b945124a31b57dd61d9
2020-11-04 14:19:46 -08:00
Wanchao Liang
4a35280ec2 [c10] fix weak_intrusive_ptr lock() (#46007)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46007

When the owner has released the object, target becomes null and it is illegal
to access refcount_ again. This PR fixes this and returns null in that case.
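
A hedged sketch of the corrected lock() logic, simplified to a bare atomic counter:

```
#include <atomic>
#include <cstddef>

// Bump the strong count only if it is still nonzero; if the owner already
// released the object, report failure so lock() can return nullptr instead
// of touching a dead refcount.
bool try_strong_ref(std::atomic<size_t>& refcount) {
  size_t count = refcount.load();
  do {
    if (count == 0) {
      return false;  // object gone
    }
    // on CAS failure, count is reloaded and we retry
  } while (!refcount.compare_exchange_weak(count, count + 1));
  return true;
}
```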

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D24374846

Pulled By: wanchaol

fbshipit-source-id: 741074f59c0904a4d60b7bde956cad2d0925be4e
2020-10-26 20:54:12 -07:00
Wanchao Liang
e8ff0f6c5c [c10] add operator= of intrusive_ptr to weak_intrusive_ptr (#44045)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44045

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D23632281

Pulled By: wanchaol

fbshipit-source-id: ea50427fc261f0c77ddaac2e73032827320d7077
2020-10-17 03:35:44 -07:00
Ailing Zhang
a47e3697ab Use iterator of DispatchKeySet. (#44682)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44682

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D23698387

Pulled By: ailzhang

fbshipit-source-id: 4fa140db9254c2c9c342bf1c8dfd952469b0b779
2020-09-18 13:34:27 -07:00
Caleb Thomas
dd4bbe1a79 Add iterator like functionality for DispatchKeySet (#44066)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44066

Add an STL input iterator to DispatchKeySet:
* The iterator walks from the first defined DispatchKey up to NumDispatchKeys.
* The iterator is invalidated once the underlying DispatchKeySet is invalidated.

Note: see http://www.cplusplus.com/reference/iterator/ for comparisons of
different iterators.
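
A hedged usage sketch of the new iterator:

```
#include <c10/core/DispatchKeySet.h>
#include <iostream>

// Range-for now visits only the keys actually present in the set.
void print_keys(c10::DispatchKeySet ks) {
  for (c10::DispatchKey k : ks) {
    std::cout << c10::toString(k) << "\n";
  }
}
```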

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D23611405

Pulled By: linux-jedi

fbshipit-source-id: 131b287d60226a1d67a6ee0f88571f8c4d29f9c3
2020-09-11 15:08:15 -07:00
Christian Puhrsch
24199e0768 tuple_map / tuple_concat (#42326)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42326

ghstack-source-id: 108868289

Test Plan: Unit tests

Reviewed By: smessmer

Differential Revision: D22846504

fbshipit-source-id: fa9539d16e21996bbd80db3e3c524b174b22069e
2020-08-03 19:19:47 -07:00
Sebastian Messmer
3a19af2427 Make operators with optional Tensor? arguments c10-full (#41610)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41610

Previously, operators that have a `Tensor?` (i.e. optional tensor) in their schema implemented it using `Tensor` in C++ and filled in an undefined tensor for the None case.
The c10 operator library, however, expects `Tensor?` to be represented as `optional<Tensor>`, so those operators couldn't be c10-full yet and still had to use codegenerated unboxing instead of templated unboxing.

This PR changes that. It extends the `hacky_wrapper_for_legacy_signatures` to not only take care of TensorOptions, but now also map between signatures taking `Tensor` and `optional<Tensor>`.
For this, it requires an additional template parameter, the expected signature, and it uses that to go argument-by-argument and unwrap any optionals it finds.
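
A hedged sketch of the per-argument mapping the wrapper performs, using the legacy convention that an undefined Tensor means None:

```
#include <ATen/core/Tensor.h>
#include <c10/util/Optional.h>
#include <utility>

// c10 schema -> legacy convention: nullopt becomes an undefined Tensor.
at::Tensor to_legacy(c10::optional<at::Tensor> t) {
  return t.has_value() ? std::move(*t) : at::Tensor();
}

// legacy convention -> c10 schema: undefined Tensor becomes nullopt.
c10::optional<at::Tensor> from_legacy(at::Tensor t) {
  return t.defined() ? c10::optional<at::Tensor>(std::move(t)) : c10::nullopt;
}
```
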
ghstack-source-id: 108873701

Test Plan: waitforsandcastle

Reviewed By: bhosmer

Differential Revision: D22607879

fbshipit-source-id: 57b2fb01a294b804f82cd55cd70f0ef4a478e14f
2020-07-31 16:09:08 -07:00
Basil Hosmer
029007c8b6 Improved coverage for unboxed->boxed kernel wrappers (#38999)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38999

Adds boxing for inplace and outplace kernels, itemizes
remaining unsupported cases, and fails compilation when
new unsupported types are introduced in op signatures.

Test Plan: Imported from OSS

Differential Revision: D21718547

Pulled By: bhosmer

fbshipit-source-id: 03295128b21d1843e86789fb474f38411b26a8b6
2020-07-29 11:31:16 -07:00
Sebastian Messmer
0494e0ad70 Back out "Revert D21581908: Move TensorOptions ops to c10" (#40595)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40595

ghstack-source-id: 106691774

Test Plan: waitforsandcastle

Differential Revision: D22247729

fbshipit-source-id: 14745588cae267c1e0cc51cd9541a9b8abb830e5
2020-06-26 12:57:09 -07:00
Xiang Gao
c7d79f35e3 Header rename complex_type.h -> complex.h (#39885)
Summary:
This file should have been renamed as `complex.h`, but unfortunately, it was named as `complex_type.h` due to a name clash with FBCode. Is this still the case and is it easy to resolve the name clash? Maybe related to the comment at https://github.com/pytorch/pytorch/pull/39834#issuecomment-642950012
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39885

Differential Revision: D22018575

Pulled By: ezyang

fbshipit-source-id: e237ccedbe2b30c31aca028a5b4c8c063087a30f
2020-06-23 16:27:09 -07:00
Sebastian Messmer
581ad48806 Revert D21581908: Move TensorOptions ops to c10
Test Plan: revert-hammer

Differential Revision:
D21581908

Original commit changeset: 6d4a9f526fd7

fbshipit-source-id: fe1e6368a09120ea40dea405e8409983541e3cb5
2020-06-23 16:10:07 -07:00
Sebastian Messmer
b623bdeabb Move TensorOptions ops to c10 (#39492)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39492

This PR adds use_c10_dispatcher: full to ops taking TensorOptions. To allow this, since the c10 operator library doesn't know about TensorOptions, we need to register the operator kernels as optional<ScalarType>, optional<Device>, optional<Layout>, optional<bool> instead, and also call them this way.

Changes:

* Add use_c10_dispatcher: full to those ops.
* Write hacky_wrapper_for_legacy_signatures, which takes an old-style kernel (i.e. one written to take TensorOptions) and creates a wrapper kernel for it that takes the scattered optional<ScalarType>, optional<Device>, optional<Layout>, optional<bool> instead.
* Change codegen so that all op registrations are wrapped into hacky_wrapper_for_legacy_signatures. This is added to all ops but is a no-op if the op doesn't take TensorOptions. This allows us in the future to just change a kernel signature from TensorOptions to the scattered version and have it work without having to touch codegen.
* Change codegen so that the frontend calls those operators with expanded arguments instead of with a TensorOptions object. This is required because now the kernels are written in this way.

This PR does not remove TensorOptions special cases from codegen; instead, it separates kernels from the codegen/frontend issues. After this, kernels can be worked on separately without having to touch codegen, and codegen can be worked on without having to touch kernels.

Codegen diff: P133121032

ghstack-source-id: 106426630

Test Plan: waitforsandcastle

Differential Revision: D21581908

fbshipit-source-id: 6d4a9f526fd70fae40581bf26f3ccf794ce6a89e
2020-06-23 14:13:34 -07:00
Gao, Xiang
dea58a7660 [resubmit] Kill thrust::complex from log kernels (#40079)
Summary:
Use `::log` instead of `std::log` for better ROCm support.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40079

Differential Revision: D22068554

Pulled By: pbelevich

fbshipit-source-id: a458ae34535a641832f816617387a45445e2fa48
2020-06-17 05:57:10 -07:00
Sebastian Messmer
f69b72c738 Back out "Revert D21986243: TORCH_FN" (#40110)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40110

Original commit changeset: 72c690c2b4c2
ghstack-source-id: 105993222

Test Plan: waitforsandcastle

Differential Revision: D22072829

fbshipit-source-id: 0bc1a3e389e2afb05688c472793d34eaddb67f2a
2020-06-16 13:38:29 -07:00
Mike Ruberry
8939849f72 Revert D21986243: TORCH_FN
Test Plan: revert-hammer

Differential Revision:
D21986243

Original commit changeset: a123571c18aa

fbshipit-source-id: 72c690c2b4c2fc39e1c9192d1c410f49bb4077a5
2020-06-16 04:43:46 -07:00
Sebastian Messmer
12cb80b5b8 TORCH_FN (#39823)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39823

Add a compile time function pointer that can be used to pass function pointers in template args.
This is very useful for metaprogramming function wrappers.
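
A self-contained sketch of the idea; `MAKE_FN_TYPE` stands in for the real `TORCH_FN` macro:

```
// Lift a function pointer into a *type*, so templates can receive it as a
// template argument and the compiler can inline through it.
template <class FuncType, FuncType* func>
struct CompileTimeFunctionPointer {
  static constexpr FuncType* pointer() { return func; }
};
#define MAKE_FN_TYPE(f) CompileTimeFunctionPointer<decltype(f), &f>

int add(int a, int b) { return a + b; }

template <class Fn>
int apply(int x, int y) {
  return Fn::pointer()(x, y);  // resolved statically, no indirect call
}

int main() { return apply<MAKE_FN_TYPE(add)>(1, 2) == 3 ? 0 : 1; }
```
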
ghstack-source-id: 105944072

Test Plan: waitforsandcastle

Differential Revision: D21986243

fbshipit-source-id: a123571c18aa0e65908cbb131f28922ceb59061c
2020-06-16 03:08:08 -07:00
Xiang Gao
eb358f49c2 Overload complex math functions on both :: and std:: (#39829)
Summary:
Because ROCm has bugs with std:: functions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39829

Differential Revision: D22018430

Pulled By: anjali411

fbshipit-source-id: 671e158d3e3342394d1deaebd7ff011cce94c31a
2020-06-15 16:53:16 -07:00
Pavel Belevich
cf64af1ad2 Revert D22036002: [pytorch][PR] Kill thrust::complex from log kernels
Test Plan: revert-hammer

Differential Revision:
D22036002

Original commit changeset: 8852a833a0c7

fbshipit-source-id: 36d3c8d0e489f8a11a6e3e9d1ae162c192748037
2020-06-14 15:30:48 -07:00
Xiang Gao
4947ee3811 Kill thrust::complex from log kernels (#39902)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39902

Differential Revision: D22036002

Pulled By: pbelevich

fbshipit-source-id: 8852a833a0c71343ae630754f00da35a66e05917
2020-06-14 11:44:28 -07:00
Xiang Gao
78acc9dffb Check reinterpret_cast of complex bidirectional (#38882)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38882

Differential Revision: D21690131

Pulled By: anjali411

fbshipit-source-id: 5634f79e5a0248843625bb4eb69e854359e5d7ef
2020-05-28 09:09:39 -07:00
Hong Xu
000fea375c Support operations on c10::complex and integer scalars (#38418)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38418

This is useful in reducing verbosity in c10::complex's general usage, and potentially also offers
performance benefits.

This brings back #34506 (which was made for std::complex).
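
A hedged usage sketch of what this enables (header path assumed):

```
#include <c10/util/complex.h>

// Mixed complex/integer arithmetic without casting the scalar first.
c10::complex<float> demo() {
  c10::complex<float> z(1.0f, 2.0f);
  return z * 3 + 1;  // previously needed c10::complex<float>(3), etc.
}
```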

Differential Revision: D21587012

Test Plan: Imported from OSS

Pulled By: malfet

fbshipit-source-id: 6dd10c2f417d6f6d0935c9e1d8b457fd29c163af
2020-05-14 22:23:14 -07:00
Sebastian Messmer
63c3b89c1c Simplify code with decltype(auto) (#30922)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30922

A new C++14 feature we can use now.
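
A hedged example of the simplification this allows:

```
#include <utility>

// decltype(auto) preserves the exact category of the wrapped call's result
// (value, lvalue ref, rvalue ref); C++11 needed a trailing return type:
//   -> decltype(f(std::forward<Args>(args)...))
template <class F, class... Args>
decltype(auto) wrap(F&& f, Args&&... args) {
  return std::forward<F>(f)(std::forward<Args>(args)...);
}
```
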
ghstack-source-id: 103767403

Test Plan: waitforsandcastle

Differential Revision: D18869644

fbshipit-source-id: 54541c8004b2116386668a31eb9b0410a603b7dc
2020-05-11 21:31:18 -07:00
Sebastian Messmer
379e717a1b Back out "Revert D18927220: if_constexpr for C++14" (#37792)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37792

Original commit changeset: a1b8755a2790
ghstack-source-id: 103609715

Test Plan: waitforsandcastle

Differential Revision: D21389755

fbshipit-source-id: 1a3c74295dbfbf07fe225be9bcd47d11e31a20fa
2020-05-07 15:20:55 -07:00
Edward Yang
a058e938f9 Refactor error msg stack handling, add TORCH_RETHROW (#37101)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37101

Fixes #36954.

The basic concept is to streamline the process of rethrowing
c10::Error with extra error information.  This is in a few
steps:

- I completely remodeled the Error data type and the internal
  invariants.  Instead of manually adding in newlines, the
  message stack formatting process is responsible for inserting
  newlines and spacing as necessary.  Call sites are then
  modified to respect the new API model.
- TORCH_RETHROW macro is added, which adds context to an error
  message and then rethrows it.

New internal assert failure looks like:

```
0 INTERNAL ASSERT FAILED at ../c10/test/util/exception_test.cpp:64, please report a bug to PyTorch.
Exception raised from TestBody at ../c10/test/util/exception_test.cpp:64 (most recent call first):
frame #0: <unknown function> + 0x6aab9 (0x7ff611d3aab9 in /data/users/ezyang/pytorch-tmp/build/lib/libc10.so)
frame #1: ...
```

Error message with context looks like:

```
This is an error
  This is context 1
  This is context 2
```

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D21202891

Pulled By: ezyang

fbshipit-source-id: 361cadd16bc52e5886dba08e79277771ada76169
2020-05-04 11:56:45 -07:00
Edward Yang
efd8f70cac Make msg() and msg_with_backtrace() private (#37094)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37094

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D21202892

Pulled By: ezyang

fbshipit-source-id: d59e6bffabd90cc734056bdce2cd1fe63262fab8
2020-05-04 11:54:34 -07:00
Gao, Xiang
c5624e831d Add overloads of std:: math functions for c10::complex [resubmit] (#37468)
Summary:
This reverts commit d167a7f654.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37468

Differential Revision: D21305110

Pulled By: anjali411

fbshipit-source-id: d1bdc9d9feac00331fc2b2b905d49f80bef680f9
2020-04-30 10:20:45 -07:00
Lu Fang
d167a7f654 Revert D21256854: [pytorch][PR] Add overloads of std:: math functions for c10::complex
Test Plan: revert-hammer

Differential Revision:
D21256854

Original commit changeset: 2112ba6b7992

fbshipit-source-id: b81c377f9cd33a493a63d1e666cbe6765516fca8
2020-04-27 13:23:34 -07:00
Gao, Xiang
6d409481b3 Add overloads of std:: math functions for c10::complex (#35725)
Summary:
Issue: https://github.com/pytorch/pytorch/issues/35284

~This depends on and contains https://github.com/pytorch/pytorch/pull/35524. Please review after the dependency gets merged and I will rebase to get a clean diff.~

The implementation of most functions follow the pattern

```C++
template<typename T>
C10_HOST_DEVICE c10::complex<T> some_function(c10::complex<T> x) {
#if defined(__CUDACC__) || defined(__HIPCC__)
  return static_cast<c10::complex<T>>(thrust::some_function(static_cast<thrust::complex<T>>(x)));
#else
  return static_cast<c10::complex<T>>(std::some_function(static_cast<std::complex<T>>(x)));
#endif
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35725

Differential Revision: D21256854

Pulled By: ezyang

fbshipit-source-id: 2112ba6b79923450feafd7ebdc7184a3eaecadb6
2020-04-27 10:32:16 -07:00
Mike Ruberry
b428f454e1 Revert D18927220: if_constexpr for C++14
Test Plan: revert-hammer

Differential Revision:
D18927220

Original commit changeset: 19a135e00af6

fbshipit-source-id: a1b8755a27903b98b742881b3ecce4f5e99543b2
2020-04-26 04:27:53 -07:00
Sebastian Messmer
f5e6f1f333 if_constexpr for C++14 (#31091)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31091

This implements a C++17 "if constexpr" like feature for C++14.
This can be used, for example, to replace SFINAE or to force the compiler to remove some parts of a function in the assembly based on a condition.
PRs stacked on top will use this to simplify some of our template metaprogramming.
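
A self-contained sketch of the technique; the names here are illustrative, not the exact c10 API:

```
#include <type_traits>
#include <utility>

// Tag-dispatch to one of two generic lambdas; only the selected branch's
// body is ever instantiated. The identity helper `_` keeps branch-local
// expressions dependent, so the untaken branch never has to compile.
namespace detail {
struct Identity {
  template <class T>
  decltype(auto) operator()(T&& x) const { return std::forward<T>(x); }
};
template <class Then, class Else>
decltype(auto) pick(std::true_type, Then&& then, Else&&) {
  return std::forward<Then>(then)(Identity{});
}
template <class Then, class Else>
decltype(auto) pick(std::false_type, Then&&, Else&& els) {
  return std::forward<Else>(els)(Identity{});
}
}  // namespace detail

template <bool Condition, class Then, class Else>
decltype(auto) if_constexpr(Then&& then, Else&& els) {
  return detail::pick(std::integral_constant<bool, Condition>{},
                      std::forward<Then>(then), std::forward<Else>(els));
}

// Usage: for T = int the .size() branch is never instantiated.
template <class T>
int size_or_zero(const T& t) {
  return if_constexpr<std::is_class<T>::value>(
      [&](auto _) { return static_cast<int>(_(t).size()); },
      [&](auto) { return 0; });
}
```
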
ghstack-source-id: 102867141

Test Plan: unit tests

Differential Revision: D18927220

fbshipit-source-id: 19a135e00af6ebb0139ce3730353762d4512158f
2020-04-25 11:31:51 -07:00
Xiang Gao
20328f67bb Add core of c10::complex [resubmit] (#36626)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36626

This reverts commit 9216c67c9e.

Test Plan: Imported from OSS

Differential Revision: D21140441

Pulled By: anjali411

fbshipit-source-id: 488530088e2ff87dc27e70d21ace88ff2967e7ab
2020-04-24 12:08:23 -07:00
Dmytro Dzhulgakov
50a1850d8d [pytorch] Route default warning sync to LOG(WARNING) - second try (#36984)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36984

Follow the LOG(WARNING) format for C++-side warnings in order to play well with larger services, especially when using glog. I need to hook into glog internals a bit in order to override FILE/LINE without having to change the whole thing to be macros, but it seems to be stable between glog versions.

Note, this also changes caffe2_log_level to warning by default - I think it's a much better default when compiling without glog (or maybe even info).

With glog output, stderr capture doesn't work any more in tests. That's why we instead use c10-level warnings capture.

Test Plan:
Run unittest in both glog and non-glog build mode:

glog:
```
W0416 12:06:49.778215 3311666 exception_test.cpp:23] Warning: I'm a warning (function TestBody)
```

no-glog:
```
[W exception_test.cpp:23] Warning: I'm a warning (function TestBody)
```

Reviewed By: ilia-cher

Differential Revision: D21151351

fbshipit-source-id: fa926d9e480db5ff696990dad3d80f79ef79f24a
2020-04-23 01:08:00 -07:00
Dmytro Dzhulgakov
30e7055ed7 Revert D21078446: [pytorch] Route default warning sync to LOG(WARNING)
Test Plan: revert-hammer

Differential Revision:
D21078446

Original commit changeset: b5d36aac54d6

fbshipit-source-id: adff2d7e396b2efdd29eeabfe393fbc55edbe635
2020-04-20 00:26:56 -07:00
Dmytro Dzhulgakov
9d5dda7c2f [pytorch] Route default warning sync to LOG(WARNING) (#36768)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36768

Follow the LOG(WARNING) format for C++-side warnings in order to play well with larger services, especially when using glog. I need to hook into glog internals a bit in order to override FILE/LINE without having to change the whole thing to be macros, but it seems to be stable between glog versions.

Note, this also changes caffe2_log_level to warning by default - I think it's a much better default when compiling without glog (or maybe even info)

Test Plan:
Run unittest in both glog and non-glog build mode:

glog:
```
W0416 12:06:49.778215 3311666 exception_test.cpp:23] Warning: I'm a warning (function TestBody)
```

no-glog:
```
[W exception_test.cpp:23] Warning: I'm a warning (function TestBody)
```

Reviewed By: ilia-cher

Differential Revision: D21078446

fbshipit-source-id: b5d36aac54d6b6295a72de6754696ccafbcb84ca
2020-04-19 23:02:55 -07:00
Mike Ruberry
9216c67c9e Revert D21021677: [pytorch][PR] Add core of c10::complex
Test Plan: revert-hammer

Differential Revision:
D21021677

Original commit changeset: 9e144e581fa4

fbshipit-source-id: ce6a88fc71ec0134d0fc6ecdddc4c4db35f89b1f
2020-04-14 13:58:24 -07:00
Xiang Gao
25252816cf Add core of c10::complex (#35524)
Summary:
Step 0 of https://github.com/pytorch/pytorch/issues/35284

Reference: https://en.cppreference.com/w/cpp/numeric/complex
We are targeting C++20. The differences across C++ versions are mostly `constexpr` qualifiers; newer versions declare more functions `constexpr`.

This PR adds the core of `c10::complex`, it includes
- standard constructors as in `std::complex`
- explicit conversion constructors converting from `std/thrust::complex` to `c10::complex`
- standard assignment operators as in `std::complex`
- conversion assignment operators converting from `std/thrust::complex` to `c10::complex`
- other standard operators as in `std::complex`
- standard methods as in `std::complex`
- explicit casting operators to std/thrust
- basic non-member functions as in `std::complex`:
  - arithmetic operators
  - `==`, `!=`
  - `<<`, `>>`
  - `std::real`, `std::imag`, `std::abs`, `std::arg`, `std::norm`, `std::conj`, `std::proj`, `std::polar`
    - Some of them are intentionally not completely implemented; these are marked as `TODO` and will be implemented in the future.

This PR does not include:
- overload of math functions

which will come in the next PR
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35524

Differential Revision: D21021677

Pulled By: anjali411

fbshipit-source-id: 9e144e581fa4b2bee62d33adaf756ce5aadc0c71
2020-04-14 11:00:24 -07:00
Sebastian Messmer
7b9ab91614 Improve boxed dispatch performance (#33313)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33313

Instead of just remembering the number of arguments and iterating over the stack,
the DispatchKeyExtractor now remembers the exact locations of the dispatch relevant arguments
(i.e. Tensor arguments) and only looks at those.
ghstack-source-id: 101908386

Test Plan: unit tests, benchmarks

Differential Revision: D19748549

fbshipit-source-id: b5b9ff2233b3507e0b600460f422912cfa9e3f0f
2020-04-11 12:04:27 -07:00
Xiang Gao
15c7486416 Canonicalize includes in c10, and add tests for it (#36299)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36299

Test Plan: Imported from OSS

Differential Revision: D20943005

Pulled By: ezyang

fbshipit-source-id: 9dd0a58824bd0f1b5ad259942f92954ba1f63eae
2020-04-10 12:07:52 -07:00
Edward Yang
3f3b96b1f8 Revert D20735881: [pytorch][PR] [WIP] [reland][pytorch][PR] Fix some incorrect annotation…
Test Plan: revert-hammer

Differential Revision:
D20735881

Original commit changeset: d21e940380f0

fbshipit-source-id: fb50a099320bfac92c9b8e1ca12cdc50d302342f
2020-03-30 12:28:27 -07:00
peter
e7a37823b0 [WIP] [reland][pytorch][PR] Fix some incorrect annotation… (#35588)
Summary:
…s found by clang-cl"

This reverts commit a9b540d109.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35588

Differential Revision: D20735881

Pulled By: ezyang

fbshipit-source-id: d21e940380f0c1b9b9b84e9cc892985fd3ad0ac3
2020-03-30 11:42:19 -07:00
Nikita Shulga
2ef5b947a8 Disable unit test failing on Windows (#35549)
Summary:
Introduce a DISABLED_ON_WINDOWS macro that adds a `DISABLED_` prefix to the test name when compiled for Win32
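
A sketch of the macro as described (gtest skips any test whose name carries the `DISABLED_` prefix):

```
#ifdef _WIN32
#define DISABLED_ON_WINDOWS(TestName) DISABLED_##TestName
#else
#define DISABLED_ON_WINDOWS(TestName) TestName
#endif

// TEST(MySuite, DISABLED_ON_WINDOWS(FlakyOnWindows)) { ... }
```
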
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35549

Test Plan: CI

Differential Revision: D20700915

Pulled By: malfet

fbshipit-source-id: adddfe2db89b7139093ceef6899862bce0adcf2d
2020-03-27 19:20:29 -07:00
Nikita Shulga
a9b540d109 Revert D20670031: [pytorch][PR] Fix some incorrect annotations found by clang-cl
Test Plan: revert-hammer

Differential Revision:
D20670031

Original commit changeset: cd8018dee703

fbshipit-source-id: 6900bf46346f0f415812607e5eff67259fc7b478
2020-03-27 18:26:01 -07:00
peter
45c9ed825a Formatting cmake (to lowercase without space for if/elseif/else/endif) (#35521)
Summary:
Running commands:
```bash
shopt -s globstar

sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i caffe2/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i torch/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i c10/**/CMakeLists.txt
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i cmake/**/*.cmake
sed -e 's/IF (/if(/g' -e 's/IF(/if(/g' -e 's/if (/if(/g' -e 's/ELSE (/else(/g' -e 's/ELSE(/else(/g' -e 's/else (/else(/g' -e 's/ENDif(/endif(/g' -e 's/ELSEif(/elseif(/g' -i cmake/**/*.cmake.in
```
We may further convert all the commands into lowercase according to the following issue: 77543bde41.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35521

Differential Revision: D20704382

Pulled By: malfet

fbshipit-source-id: 42186b9b1660c34428ab7ceb8d3f7a0ced5d2e80
2020-03-27 14:25:17 -07:00
peter
0c16cedafe Fix some incorrect annotations found by clang-cl (#35364)
Summary:
Fixes incorrect usages of symbol annotations including:
1. Exporting or importing a function/class in an anonymous namespace.
2. Exporting or importing a function/class implementation in a header file. However, by removing the symbol annotations, they are now local symbols. If they need to be remain global, I can move the implementations to the source file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35364

Differential Revision: D20670031

Pulled By: ezyang

fbshipit-source-id: cd8018dee703e2424482c27fe9608e040d8105b8
2020-03-27 10:40:04 -07:00
Edward Yang
3d0a470d89 Rename DispatchKey::UndefinedTensorId to Undefined (#32728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32728

It doesn't have much to do with tensors anymore.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D19628093

Pulled By: ezyang

fbshipit-source-id: 4d57111cdf44ba347bec8a32bb5b4b47a83c1eaf
2020-01-30 11:47:40 -08:00
Brian Wignall
f326045b37 Fix typos, via a Levenshtein-type corrector (#31523)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking.

Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523

Differential Revision: D19216749

Pulled By: mrshenli

fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea
2020-01-17 16:03:19 -08:00
Pavel Belevich
62b06b9fae Rename TensorTypeId to DispatchKey (#32154)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32154

TensorTypeId -> DispatchKey
	c10/core/TensorTypeId.h -> c10/core/DispatchKey.h
	c10/core/TensorTypeId.cpp -> c10/core/DispatchKey.cpp
	TensorTypeId::* -> DispatchKey::*
	TensorTypeId type_id -> DispatchKey dispatch_key
		type_id -> dispatch_key
	TensorTypeId::NumTensorIds -> DispatchKey::NumDispatchKeys
	RealTensorTypeId -> RealDispatchKey
TensorTypeSet -> DispatchKeySet
	TensorTypeIds -> DispatchKeys
	c10/core/TensorTypeSet.h -> c10/core/DispatchKeySet.h
	c10/core/TensorTypeSet.cpp -> c10/core/DispatchKeySet.cpp
	type_set() -> key_set()
	type_set_ -> key_set_
	typeSet -> keySet
ExcludeTensorTypeIdGuard -> ExcludeDispatchKeyGuard
IncludeTensorTypeIdGuard -> IncludeDispatchKeyGuard
LocalTensorTypeSet -> LocalDispatchKeySet
	c10/core/impl/LocalTensorTypeSet.h -> c10/core/impl/LocalDispatchKeySet.h
	c10/core/impl/LocalTensorTypeSet.cpp -> c10/core/impl/LocalDispatchKeySet.cpp
	tls_local_tensor_type_set -> tls_local_dispatch_key_set
	tls_is_tensor_type_id_excluded -> tls_is_dispatch_key_excluded
	tls_set_tensor_type_id_excluded -> tls_set_dispatch_key_excluded
	tls_is_tensor_type_id_included -> tls_is_dispatch_key_included
	tls_set_tensor_type_id_included -> tls_set_dispatch_key_included
MultiDispatchTensorTypeSet -> MultiDispatchKeySet
	multi_dispatch_tensor_type_set -> multi_dispatch_key_set
tensorTypeIdToBackend -> dispatchKeyToBackend
backendToTensorTypeId -> backendToDispatchKey
initForTensorTypeSet -> initForDispatchKeySet
inferred_type_set -> inferred_key_set
computeTensorTypeId -> computeDispatchKey
PODLocalTensorTypeSet raw_local_tensor_type_set -> PODLocalDispatchKeySet raw_local_dispatch_key_set
get_default_tensor_type_id -> get_default_dispatch_key
inferred_type_id -> inferred_dispatch_key
actual_type_id -> actual_dispatch_key
typeSetToDispatchKey_ -> dispatchKeySetToDispatchKey_
get_type_id() -> get_dispatch_key()
legacyExtractTypeId -> legacyExtractDispatchKey
extractTypeId -> extractDispatchKey

Test Plan: Imported from OSS

Differential Revision: D19398900

Pulled By: pbelevich

fbshipit-source-id: 234ad19f93d33e00201b61e153b740a339035776
2020-01-15 11:16:08 -08:00
Sebastian Messmer
f67851d69a Fix c10::util::get_fully_qualified_type_name for MSVC (#31313)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31313

This is a bugfix. The reason we couldn't enable the constexpr-ness for it before is that it was buggy,
and without constexpr it crashed at runtime rather than at compile time, which unfortunately seems to have slipped past our CI...
ghstack-source-id: 96380160

Test Plan: Now it works even when enabling constexpr for it

Differential Revision: D19087471

fbshipit-source-id: 28be107389f4507d35d08eab4b089a405690529b
2020-01-08 09:11:10 -08:00
Edward Yang
9116f02beb Rename TORCH_DCHECK to TORCH_INTERNAL_ASSERT_DEBUG_ONLY (#31917)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31917

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D19301480

Pulled By: ezyang

fbshipit-source-id: fcce8868733965b9fbd326b4ec273135759df377
2020-01-07 17:28:47 -08:00
Sebastian Messmer
c21f89970f Remove c++14-conditional constexpr (#30916)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30916

These macros said "make it constexpr if we're in C++14". Since we're now always C++14, we can just say "constexpr" instead.
ghstack-source-id: 96369584

Test Plan: waitforsandcastle

Differential Revision: D18869635

fbshipit-source-id: f41751e4e26fad6214ec3a98db2d961315fd73ff
2020-01-07 16:40:11 -08:00
Junjie Bai
489dd6cb90 Add TORCH_DCHECK macro that checks only in debug builds (#31240)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31240

Follow up on discoveries/discussions in https://github.com/pytorch/pytorch/pull/30810

Mimic the `DCHECK` macro from https://github.com/pytorch/pytorch/blob/e5eb871/c10/util/logging_is_not_google_glog.h#L117-L125

With this change the perf gap is eliminated:

```
================================================================================
Program Output:
================================================================================
Run on (36 X 1601 MHz CPU s)
2019-12-12 20:12:13
-----------------------------------------------------------------
Benchmark                          Time           CPU Iterations
-----------------------------------------------------------------
BM_IntrusivePtrCtorDtor           23 ns         23 ns   30914703
BM_SharedPtrCtorDtor              27 ns         27 ns   25895944
BM_IntrusivePtrArray/16          503 ns        503 ns    1392139
BM_IntrusivePtrArray/32         1006 ns       1006 ns     695749
BM_IntrusivePtrArray/64         2013 ns       2013 ns     347714
BM_IntrusivePtrArray/128        4024 ns       4024 ns     173964
BM_IntrusivePtrArray/256        8047 ns       8047 ns      86994
BM_IntrusivePtrArray/512       16106 ns      16106 ns      43461
BM_IntrusivePtrArray/1024      32208 ns      32207 ns      21731
BM_IntrusivePtrArray/2048      64431 ns      64430 ns      10865
BM_IntrusivePtrArray/4096     128940 ns     128938 ns       5429
BM_SharedPtrArray/16             503 ns        503 ns    1392128
BM_SharedPtrArray/32            1006 ns       1006 ns     695940
BM_SharedPtrArray/64            2012 ns       2012 ns     347817
BM_SharedPtrArray/128           4024 ns       4023 ns     173927
BM_SharedPtrArray/256           8069 ns       8069 ns      86741
BM_SharedPtrArray/512          16143 ns      16142 ns      43357
BM_SharedPtrArray/1024         32283 ns      32283 ns      21685
BM_SharedPtrArray/2048         64718 ns      64717 ns      10817
BM_SharedPtrArray/4096        129469 ns     129466 ns       5407
================================================================================
```
```
================================================================================
Program Output:
================================================================================
Run on (80 X 2001 MHz CPU s)
2019-12-12 20:12:23
-----------------------------------------------------------------
Benchmark                          Time           CPU Iterations
-----------------------------------------------------------------
BM_IntrusivePtrCtorDtor           18 ns         18 ns   38630411
BM_SharedPtrCtorDtor              22 ns         22 ns   32356114
BM_IntrusivePtrArray/16          402 ns        402 ns    1739637
BM_IntrusivePtrArray/32          805 ns        805 ns     869818
BM_IntrusivePtrArray/64         1610 ns       1609 ns     434881
BM_IntrusivePtrArray/128        3218 ns       3218 ns     217437
BM_IntrusivePtrArray/256        6436 ns       6436 ns     108739
BM_IntrusivePtrArray/512       12882 ns      12882 ns      54356
BM_IntrusivePtrArray/1024      25763 ns      25763 ns      27177
BM_IntrusivePtrArray/2048      51532 ns      51531 ns      13590
BM_IntrusivePtrArray/4096     103091 ns     103091 ns       6778
BM_SharedPtrArray/16             402 ns        402 ns    1740165
BM_SharedPtrArray/32             804 ns        804 ns     869035
BM_SharedPtrArray/64            1610 ns       1610 ns     434975
BM_SharedPtrArray/128           3218 ns       3218 ns     217505
BM_SharedPtrArray/256           6457 ns       6457 ns     108510
BM_SharedPtrArray/512          12909 ns      12909 ns      54249
BM_SharedPtrArray/1024         25810 ns      25810 ns      27127
BM_SharedPtrArray/2048         51763 ns      51763 ns      13531
BM_SharedPtrArray/4096        103506 ns     103505 ns       6759
================================================================================
```

Test Plan:
buck test caffe2/c10/...
buck test mode/opt caffe2/c10/...

Differential Revision: D18998243

fbshipit-source-id: ddf0a118a80efe032b52d403867c1f416c721590
2019-12-18 21:55:58 -08:00
Sebastian Messmer
643ca5def2 Replace c10::guts::stuff with std::stuff (#30915)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30915

Since we now have C++14, we don't need these c10::guts helpers anymore
ghstack-source-id: 95777609

Test Plan: waitforsandcastle

Differential Revision: D18869639

fbshipit-source-id: 97716f932297c64c6e814410ac47b444c33d4e2e
2019-12-16 13:57:19 -08:00
Sebastian Messmer
2950530031 caffe2::TypeMeta uses compile time type names (#26619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26619

ghstack-source-id: 95348564

Test Plan: unit tests

Differential Revision: D17519252

fbshipit-source-id: 337ec76d17172dd1af60a1676d69964a41dcb7a1
2019-12-14 20:29:16 -08:00
Sebastian Messmer
6e1e09fd10 Compile time type names (#26618)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26618

Implement a mechanism to get type names at compile time
In a future diff, I'm planning to introduce this to caffe2::TypeMeta and a few other places.
ghstack-source-id: 95337871

Test Plan: unit tests

Differential Revision: D17519253

fbshipit-source-id: e14017f962fd181d147accb3f53fa8d6ee42a3f8
2019-12-14 20:29:11 -08:00
Qi Zhou
44ff7b08d8 Reduce intrusive_ptr incref/decref costs (#30709)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30709

intrusive_ptr doesn't provide an explicit incref method. When users want to
incref the target, they create an intrusive_ptr to wrap the target, then make
a copy (which does the actual incref), then release both the first
intrusive_ptr and the copy to prevent a decref at destruction time. This is
very inefficient. Instead, do the incref/decref directly.

Differential Revision: D18798505

fbshipit-source-id: 524d4f30d07d733df09d54423b044d80e4651454
2019-12-06 11:52:20 -08:00
Brian Wignall
e7fe64f6a6 Fix typos (#30606)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606

Differential Revision: D18763028

Pulled By: mrshenli

fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c
2019-12-02 20:17:42 -08:00
Sebastian Messmer
70e9ef518f c10::string_view (#26616)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26616

Implement C++17 std::string_view for C++11.

This is useful for compile-time type name retrieval, which I'm going to stack on top of this diff.
It is also useful for replacing `const std::string&` with it throughout our codebase.
ghstack-source-id: 92100314
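
A hedged usage sketch (the header path is an assumption): taking `c10::string_view` by value lets callers pass string literals without materializing a temporary `std::string`.

```
#include <c10/util/string_view.h>

bool is_aten_op(c10::string_view name) {
  // No std::string is allocated when a literal is passed in.
  return name.substr(0, 6) == "aten::";
}

// is_aten_op("aten::add") -> true, with zero heap allocations.
```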

Test Plan: unit tests

Differential Revision: D17518992

fbshipit-source-id: 48e31c677d51b0041f4b37e89a92bd176d4a0b08
2019-10-21 16:10:40 -07:00
Sebastian Messmer
d9de2e0ba9 Back out "Revert D17936166: [wip] Constexpr type ids" (#28155)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28155

Original commit changeset: 92c63a96dedd
ghstack-source-id: 92051874

Test Plan: unit tests

Differential Revision: D17964410

fbshipit-source-id: 1d989d28b3e1de6d43c915f122f2b65a77a332eb
2019-10-16 18:24:04 -07:00
Lu Fang
1819fade35 Revert D17936166: [wip] Constexpr type ids
Test Plan: revert-hammer

Differential Revision:
D17936166

Original commit changeset: 68cfa926c721

fbshipit-source-id: 92c63a96dedd8764e342c6437c6ea308d93d29b2
2019-10-16 06:47:10 -07:00
Sebastian Messmer
9cc4405dc9 Constexpr type ids (#28023)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28023

ghstack-source-id: 91987335

Test Plan: waitforsandcastle

Differential Revision: D17936166

fbshipit-source-id: 68cfa926c721e5fbc96e083eb47e784bf34a9df4
2019-10-15 21:21:20 -07:00
Sebastian Messmer
ef8bcfe2c7 Revert D17488861: constexpr type ids
Test Plan: revert-hammer

Differential Revision:
D17488861

Original commit changeset: ce7b059d7c86

fbshipit-source-id: 426fca9abe7122190fc17ac6976bc6bcbd5718df
2019-10-15 09:59:21 -07:00
Sebastian Messmer
6f865c1e37 constexpr type ids (#26502)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26502

Create type ids at compile time instead of incrementing a counter at runtime. This is done by computing a compile time crc64 on the type name. We couldn't do this before, because we still used GCC4 and that compiler didn't support the use of `__PRETTY_FUNCTION__` in a constexpr context. However, since GCC5 this is possible and we can use this trick.

This does not change the semantics of preallocated type ids. I actually think we don't need to preallocate anymore, but I split the removal of preallocation into a separate diff to be able to test it separately.

ghstack-source-id: 91896920
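
A compact sketch of the scheme, with a constexpr FNV-1a hash standing in for the crc64 the commit actually uses; all names here are illustrative.

```
#include <cstdint>

// C++11-style recursive constexpr hash over a C string.
constexpr uint64_t fnv1a(const char* s,
                         uint64_t h = 14695981039346656037ull) {
  return *s == '\0'
      ? h
      : fnv1a(s + 1, (h ^ static_cast<uint8_t>(*s)) * 1099511628211ull);
}

// __PRETTY_FUNCTION__ embeds T's name, so each T hashes differently.
template <typename T>
constexpr uint64_t type_id() {
  return fnv1a(__PRETTY_FUNCTION__);
}

static_assert(type_id<int>() != type_id<float>(), "ids are distinct");
```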

Test Plan: unit tests

Differential Revision: D17488861

fbshipit-source-id: ce7b059d7c8686b69cb091a4a8beaf4b96391343
2019-10-15 08:47:09 -07:00
Elias Ellison
921079c5c2 flat hash map that preserves insertion and deletion order (#25675)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25675

This will be used to support OrderedDict in Python. It modifies the existing `flat_hash_map` to preserve insertion and deletion order.
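
A toy sketch of one standard way to make a hash map insertion-ordered (the real `flat_hash_map` changes are more involved, and preserving order across deletions needs extra bookkeeping): keep entries in a vector and index them through the table.

```
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

template <class K, class V>
class OrderedMap {
  std::vector<std::pair<K, V>> entries_;      // holds insertion order
  std::unordered_map<K, std::size_t> index_;  // key -> slot in entries_

 public:
  void insert(const K& key, V value) {
    if (index_.count(key)) {
      return;  // keep the first-insertion position, like an OrderedDict
    }
    index_.emplace(key, entries_.size());
    entries_.emplace_back(key, std::move(value));
  }

  // Iteration walks entries in the order they were inserted.
  auto begin() const { return entries_.begin(); }
  auto end() const { return entries_.end(); }
};
```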

Test Plan: Imported from OSS

Differential Revision: D17440131

Pulled By: eellison

fbshipit-source-id: c7a6a290c8471627f5a061c0cca8e98ff131c9b4
2019-09-18 22:36:31 -07:00
Edward Yang
2080a15860 Add VariableTensorId, store it in TensorTypeSet (#25597)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25597

We now take advantage of the new bitset representation TensorTypeSet to store "Variable-ness" of a tensor directly in the dispatch key. We introduce a new thread local TensorTypeSet "excluded" and replace the previous thread local boolean with it; we no longer have to query `is_variable()` to do dispatch (I didn't delete `is_variable`, because there are still a lot of uses of it). The key change is in `dispatchTypeId`.
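
A schematic sketch of the key change, with illustrative names rather than c10's exact API: the thread-local excluded set is subtracted from the tensor's type set before the dispatch key is chosen.

```
#include <cstdint>

enum class TensorTypeId : uint8_t { Undefined, CPU, CUDA, Variable };

struct TensorTypeSet {
  uint64_t bits = 0;
  TensorTypeSet add(TensorTypeId t) const {
    return TensorTypeSet{bits | (1ull << static_cast<uint8_t>(t))};
  }
  TensorTypeSet removeAll(TensorTypeSet o) const {
    return TensorTypeSet{bits & ~o.bits};
  }
  TensorTypeId highestPriorityTypeId() const {
    for (int i = 63; i >= 0; --i) {
      if (bits & (1ull << i)) return static_cast<TensorTypeId>(i);
    }
    return TensorTypeId::Undefined;
  }
};

// Replaces the previous thread-local boolean.
thread_local TensorTypeSet tls_excluded;

// Variable is the highest-priority key, so it wins dispatch unless it
// has been excluded (e.g. while autograd re-dispatches to the backend).
TensorTypeId dispatchTypeId(TensorTypeSet ts) {
  return ts.removeAll(tls_excluded).highestPriorityTypeId();
}
```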

Knock-on effects:
* Because Variable is now a TensorTypeId, I can eliminate the out-of-line registration `registerVariableOp` for variables; instead, make the registrar take a TensorTypeId (instead of a Backend) and you just register under the Variable key.
* Tensors aren't really ever created with Variable information initialized correctly at the start; instead, a tensor "becomes" a Variable because we set its `autograd_meta_`. These setters now correctly setup invariants on the dispatch type set. The new invariant is that if `autograd_meta_ != nullptr`, then `type_set().has(TensorTypeId::VariableTensorId)`.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D17265919

Pulled By: ezyang

fbshipit-source-id: a90a7ed14f5cb1086137483ae3d0646fcd4c42d0
2019-09-11 08:59:48 -07:00
Edward Yang
aa49aa856c Tensor type set (#25308)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25308

Instead of storing a single TensorTypeId in a Tensor, we store a bitset of tensor type IDs in a Tensor, TensorTypeSet. This class comes with some unit tests.  This is in preparation for making Variable a TensorTypeId. In order to help flush out places where this makes a semantic difference, we rename `Tensor::type_id()` to `Tensor::type_set()` and smoke out all of the locations where this was semantically meaningful.

Because the new tensor type set is 64-bits, this increases the size of Tensor by a word.
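
A minimal sketch of the bitset representation and of the equality-to-inclusion change described below (names are illustrative, and the member list is abbreviated):

```
#include <cstdint>

enum class TensorTypeId : uint8_t { SparseCPUTensorId = 1, SparseCUDATensorId = 2 };
enum class DeviceType { CPU, CUDA };

struct TensorTypeSet {
  uint64_t bits = 0;  // one bit per TensorTypeId
  bool has(TensorTypeId t) const {
    return bits & (1ull << static_cast<uint8_t>(t));
  }
};

// Before: an equality test on a single id, e.g.
//   if (type_id == TensorTypeId::SparseCPUTensorId) ...
// After: set inclusion, valid as long as no tensor is simultaneously
// a sparse CPU tensor and a sparse CUDA tensor.
DeviceType sparseTensorSetToDeviceType(TensorTypeSet ts) {
  return ts.has(TensorTypeId::SparseCPUTensorId) ? DeviceType::CPU
                                                 : DeviceType::CUDA;
}
```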

Listing of semantic changes:
* Many TensorImpl related constructors just propagate TensorTypeId to a parent constructor. These are pretty simple to adjust.
  * Backend extensions are now in the business of explicitly constructing a TensorTypeSet and then passing it in. This is probably OK for now but when Variable drops, these dispatch IDs may get immediately overwritten to have Variable set.
* `sparseTensorSetToDeviceType` and similar functions previously did an equality test with TensorTypeId, to determine what an appropriate device type is. This equality is now replaced with a set inclusion test. This is valid, under the assumption that we don't ever have weird sets like "this tensor is simultaneously a sparse CPU tensor and a sparse CUDA tensor", which will be true in the short term plan of adding Variable to the dispatch ID.
* `impl::dispatchTypeId` was generally introduced for cases where we legitimately need to convert from `TensorTypeSet -> TensorTypeId` in a dispatch related manner. At the moment, the implementation is trivial, but they will soon be adjusted to handle TLS. I've tried to make these call sites as forwards compatible as possible:
  * `checked_tensor_unwrap` and co now use `dispatchTypeId`. When Variable is added to the type set, these will always be called in a context where the Variable type ID is disabled, so we will get the correct underlying tensor type ID.
  * Uses of `Backend` in dispatch are now replaced with `TensorTypeSet`. The general heuristic here for whether or not to accept a `TensorTypeId` or `TensorTypeSet` is that we want to make the generated code as simple as possible. It is easier to retrieve a `TensorTypeSet`, so that's a more appropriate API in these cases.
* In some cases, I could not conveniently switch an implementation to the new semantics, because it was blocked on some other refactor. In this case, I introduced `legacyExtractTypeId`, which gives what would be a BC-compatible `TensorTypeSet` to `TensorTypeId` implementation that will continue to report the same values it would have prior to this change. This is **different** from `dispatchTypeId`, because this function does NOT respect TLS; it always ignores Variable type IDs.
  * c10 dispatcher tests, which are oblivious to Variable dispatch, use this BC function (actually, they use `extractTypeId`, an overload for Tensor).
  * The implementation of `new_*` methods heavily relies on tensor type ID; I chose not to unwind this. PR to refactor this at https://github.com/pytorch/pytorch/pull/25475
  * Slicing also relies on tensor type ID, see `torch/csrc/autograd/python_variable_indexing.cpp` (though in some cases in this file, I was able to replace use of tensor type ID with TensorOptions)
* In some cases, there is an equality test on tensor type ID which would be better done by testing "tensor axes". In those cases, I replaced each single equality test with multiple equality tests.
  * Example: `torch/csrc/nn/type_checks.h`
  * There is a total punt in `torch/csrc/tensor/python_tensor.cpp` where "instance of" checking is done via dispatch ids. In general, the Variable-ness of a tensor doesn't participate in instanceof testing. It's not entirely clear what to do here.
  * Instead of storing `Backend` in `VariableInfo`, we now just store Layout.

c10 dispatcher test updates were done with:

```
:%s/\([^ ]\+\)\.type_id()/extractTypeId(\1)/g
:%s/\([^( ]\+\)->type_id()/extractTypeId(*\1)/g
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/25308

Differential Revision: D17092791

Test Plan: sandcastle and ossci

Reviewed By: bwasti

Pulled By: ezyang

fbshipit-source-id: 22207d14fe62dd31ee19cc5011af22e3d9aabb5b
2019-09-10 10:30:54 -07:00
hongzhen
378881e903 Enable log_softmax and CrossEntropyLoss for bfloat16 (#24457)
Summary:
Enabled torch.nn.functional.log_softmax and torch.nn.CrossEntropyLoss for bfloat16 data type.
In order to do that, the following dependencies had to be enabled:
- RNE (round to nearest even)
- AccumulateType
- bfloat16 arithmetic operator overload

Also, we fully implement std::numeric_limits support for the bfloat16 data type.

Background for the dependencies (see the combined sketch below):
- RNE vs truncate
From a torch.nn.CrossEntropyLoss test with input_size=(128, 1000):
RNE result:
float    output:  tensor(7.3981, dtype=torch.float32, grad_fn=<NllLossBackward>)
bfloat16 output:  tensor(7.3125, dtype=torch.bfloat16, grad_fn=<NllLossBackward>)
truncate result:
float    output:  tensor(7.3981, dtype=torch.float32, grad_fn=<NllLossBackward>)
bfloat16 output:  tensor(5.8750, dtype=torch.bfloat16, grad_fn=<NllLossBackward>)

- scalar_t vs AccumulateType (AccumulateType of bfloat16 is float)
AccumulateType is essential to keeping accuracy, especially for reduction-related operations.
We have verified this with both local test cases and a real topology. It turns out that a bfloat16 accumulator causes a huge relative error when the number of elements is large, even more than 50%.
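
A hedged sketch of both dependencies, representing bfloat16 by its raw uint16_t bit pattern; NaN handling and the real c10::BFloat16 type are omitted.

```
#include <cstdint>
#include <cstring>

// float -> bfloat16 with round-to-nearest-even (RNE). Plain truncation
// would just keep the top 16 bits, which produces the large error shown
// in the comparison above.
uint16_t float_to_bfloat16_rne(float value) {
  uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);  // ties to even
  return static_cast<uint16_t>((bits + rounding_bias) >> 16);
}

float bfloat16_to_float(uint16_t b) {
  uint32_t bits = static_cast<uint32_t>(b) << 16;
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// The AccumulateType idea: reduce bfloat16 inputs into a float
// accumulator so rounding error does not compound element by element.
float sum_bf16(const uint16_t* data, int64_t n) {
  float acc = 0.0f;  // AccumulateType of bfloat16 is float
  for (int64_t i = 0; i < n; ++i) {
    acc += bfloat16_to_float(data[i]);
  }
  return acc;
}
```
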
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24457

Differential Revision: D17113018

Pulled By: ezyang

fbshipit-source-id: 8d61297ca118f9b5c6730a01efcf3a3704d2f206
2019-09-09 09:19:47 -07:00
Edward Yang
58a0dee749 Replace open registration TensorTypeId with closed enum. (#25252)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25252

Our model going forward for extensions is that you will have to get an ID
allocated in our system. This is how things work in practice today; we're
just simplifying the underlying registration, since there is no need for
distributed registration.
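
A sketch of the shape of the change, with an abbreviated member list: IDs become entries in one closed enum, allocated by editing the enum rather than registered at runtime.

```
#include <cstdint>

// Before: ids were handed out by a runtime registry, one per
// registration call. After: a single closed enum, where a backend
// extension gets its id by having a member added here.
enum class TensorTypeId : uint8_t {
  UndefinedTensorId,
  CPUTensorId,
  CUDATensorId,
  XLATensorId,
  // ...further backends are allocated entries here.
};
```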

There are some codemods in this diff:

```
codemod --extensions cpp,h,cc,cuh,py,in --exclude-paths=c10/core/TensorTypeId.h '([A-Za-z]+?)TensorId\(\)' 'TensorTypeId::\1TensorId'
codemod --extensions cpp,h,cc,cuh,py,in 'TensorTypeIds::undefined\(\)' 'TensorTypeId::UndefinedTensorId'
codemod --extensions cpp 'TensorType1\(\)' 'TensorTypeId::CPUTensorId'
codemod --extensions cpp 'TensorType2\(\)' 'TensorTypeId::CUDATensorId'
codemod --extensions cpp 'TensorType3\(\)' 'TensorTypeId::XLATensorId'
codemod --extensions cpp 'TensorType1' 'CPUTensorId'
codemod --extensions cpp 'TensorType2' 'CUDATensorId'
codemod --extensions cpp 'TensorType3' 'XLATensorId'
```

The main hand-written changes are in c10/core/TensorTypeId.h

Other manual fixes:

- aten/src/ATen/core/op_registration/op_registration.cpp - stop using
  std::string operator+
- aten/src/ATen/function_wrapper.py - handle a hardcoded TypeId() that
  wasn't caught by codemod
- torch/csrc/tensor/python_tensor.h - fix now incorrect forward declaration
  of TensorTypeId
- aten/src/ATen/core/op_registration/ - remove out-of-line registration

Differential Revision: D17072001

Test Plan: ossci and sandcastle

Pulled By: ezyang

fbshipit-source-id: c641515fd0604c045c54fbb1d6b1b950f45e89d1
2019-08-29 08:55:58 -07:00
Iurii Zdebskyi
5b9f55f33f Enable Add, sub, mul, and div on CPU for bfloat16 type. (#22851)
Summary:
Enable Add, sub, mul, and div on CPU for bfloat16 type.
Tested via unit tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22851

Differential Revision: D16256757

Pulled By: izdeby

fbshipit-source-id: 8b62f7581fc0ca0d2cff48ab40d877a9fcf70a5b
2019-08-08 12:34:25 -07:00
Iurii Zdebskyi
3a8d7463bd Enabled BFloat16 storage (#21523)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21523
ghimport-source-id: 698b3cbd6b21c09b9ff8bf8011980df8e35c33b0

Test Plan: Imported from OSS

Differential Revision: D15819368

Pulled By: izdeby

fbshipit-source-id: f6b3bba7b3ca8ee677bd80a231dbb3920c07d61c
2019-07-09 21:51:06 -07:00
Iurii Zdebskyi
e72b617eb5 Introducing bfloat16 type (#21522)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21522
ghimport-source-id: 4803f197ec04938501fdb10c1741280331c349d2

Test Plan: Imported from OSS

Differential Revision: D15819369

Pulled By: izdeby

fbshipit-source-id: 46408dc316a5c4dc644a736dc42da2422b34bcb9
2019-07-09 21:14:10 -07:00
Sebastian Messmer
d1c80300ce Better stringification of dispatch keys in error messages (#21809)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21809

Many error messages show dispatch keys, for example when the dispatcher didn't find a kernel to dispatch to.
Previously, this was a string like "CPU" or "CUDA" for known backends and just an arbitrary number for other backends.

Now, tensor type id registration also registers a name for the dispatch key and shows that in the error messages.

There is no API change; the error messages are just better now.
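
An illustrative sketch of the mechanism (not c10's exact API): keep a name alongside each registered key and fall back to the numeric value only for keys that never registered one.

```
#include <cstdint>
#include <string>
#include <unordered_map>

std::unordered_map<uint8_t, std::string>& key_names() {
  static std::unordered_map<uint8_t, std::string> names;
  return names;
}

// Called at tensor type id registration time, alongside the id itself.
void register_key_name(uint8_t key, std::string name) {
  key_names().emplace(key, std::move(name));
}

// Errors can now print "CPU" for known backends instead of a bare number.
std::string key_to_string(uint8_t key) {
  auto it = key_names().find(key);
  return it != key_names().end() ? it->second : std::to_string(key);
}
```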

Differential Revision: D15835809

fbshipit-source-id: 4f0c9d0925c6708b02d79c653a2fae75b6623bb9
2019-06-19 11:44:24 -07:00