Commit Graph

67 Commits

Peter Goldsborough
7a61306031 Enable all clang-tidy performance checks (#15198)
Summary:
This PR adds the final set of clang-tidy checks we should add for our codebase: a last set of performance-related checks. Most fixes here change `auto` to `const auto&` in places where unnecessary copies were being made, and add `reserve()` calls before loops doing repeated `push_back()`. There were also a few cases of calling `std::string::find` with a single-character string literal instead of a single char; the string overload uses a less efficient search algorithm meant for searching larger substrings.

![image](https://user-images.githubusercontent.com/6429851/49978940-adc1a780-ff01-11e8-99da-a4e431361f07.png)

ezyang apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15198

Differential Revision: D13468797

Pulled By: goldsborough

fbshipit-source-id: 2bed1ea1c7c162b7f3e0e1026f17125e88c4d5b2
2018-12-14 13:32:47 -08:00
Peter Goldsborough
1e9c384afb Enable performance-unnecessary-value-param in .clang-tidy (#15026)
Summary:
This PR fixes around 250 places in the codebase where we were making unnecessary copies of objects (some large, some small).

ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15026

Differential Revision: D13458784

Pulled By: goldsborough

fbshipit-source-id: be5148b2ce09493588d70952e6f6d6ff5ec5199b
2018-12-13 16:15:35 -08:00
Edward Yang
517c7c9861 Canonicalize all includes in PyTorch. (#14849)
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>.
Paths are adjusted to be rooted at aten/src, torch/lib, or
the top-level directory.

I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.

I used the following script to do the canonicalization:

```
  import subprocess
  import re
  import os.path

  files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
  for fn in files:
      if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cuh', '.cc']):
          continue
      if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
          continue
      with open(fn, 'r') as f:
          c = f.read()
      def fmt(p):
          return "#include <{}>".format(p)
      def repl(m):
          p = m.group(1)
          if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
              return fmt(p)
          if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
              return fmt(p)
          for root in ["aten/src", "torch/lib", ""]:
              for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                  new_p = os.path.relpath(os.path.join(bad_root, p), root)
                  if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                      return fmt(new_p)
          print("ERROR: ", fn, p)
          return m.group(0)
      new_c = re.sub(r'#include "([^"]+)"', repl, c)
      if new_c != c:
          print(fn)
          with open(fn, 'w') as f:
              f.write(new_c)
```

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849

Reviewed By: dzhulgakov

Differential Revision: D13363445

Pulled By: ezyang

fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
2018-12-08 19:38:30 -08:00
Thomas Viehmann
2d56df7892 Use .to to convert new tensors in new_tensor (#14097)
Summary:
This would solve the tracing problems of #13969.
Fixes: #14732

I would appreciate if this got good scrutiny before applied.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14097

Differential Revision: D13323181

Pulled By: ezyang

fbshipit-source-id: dcd104b497c0bfddb751923c6166a3824b7a3702
2018-12-04 14:03:56 -08:00
Edward Yang
6fe1867c23 Expunge direct device index handling from tensor_conversion_dispatch (#14421)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14421

Last time I looked at this, I bailed because it seemed like there were
a lot of sites to fix.  Well, I need this to work properly for out-of-place
HIPify, so I took another whack at it.  Changes should be pretty self-explanatory.

Reviewed By: gchanan

Differential Revision: D13221302

fbshipit-source-id: ed21e2668a1a629898a47358baf368fe680263a0
2018-11-29 16:04:10 -08:00
Edward Yang
e35418b3be New implementations of DeviceGuard, StreamGuard and MultiStreamGuard (with CUDA specializations) (#13342)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13342

This PR introduces a few new concepts:

- DeviceGuardImplInterface, and implementations for CPU and CUDA, which
  provide a generic interface for interfacing with device and stream state,
  without requiring a direct dependency on the code in question.
- InlineDeviceGuard, a general template for generating both specialized
  and dynamically dispatched device guard implementations.  Dynamic
  dispatch is done by specializing it on a VirtualGuardImpl.
- Provide a device-independent DeviceGuard class, which can be used even
  from CPU code. It uses the aforementioned dynamic dispatch.
- CUDA-specialized CUDAGuard class, which doesn't have a dynamic dispatch
  but can only be used from CUDA.
- StreamGuard, which is the same as above, but for streams rather than
  devices.
- Optional variants of all the aforementioned guards, which are a no-op if
  no device/stream is specified.
- CUDAMultiStreamGuard, specifically for the case when we want to set
  a stream on every device.

There are some subtle semantic changes, which have been thoroughly documented
in the class definition.

BC-breaking changes:

- Move constructor/assignment have been removed from all device guard
  implementations.
- In some cases where you previously wrote 'set_device' (or 'set_stream'), you now must write
  'reset_device', because if you switch devices/device types, the stream/device on the
  previous device is unset.  This is different from previous behavior.
- CUDAGuard no longer handles streams, or multiple streams.  Use CUDAStreamGuard
  or CUDAMultiStreamGuard as appropriate for your use case.

Reviewed By: dzhulgakov

Differential Revision: D12849620

fbshipit-source-id: f61956256f0b12be754b3234fcc73c2abc1be04e
2018-11-11 12:11:10 -08:00
Edward Yang
0aaff5eaf9 Replace CUDA-specific set_index(_from) method from DeviceGuard with set_device. (#13275)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13275

This resulted in a bunch of knock-on changes, which I will now
describe:

- s/original_index/original_device/
- s/last_index/last_device/
- A bunch of places that used set_index, now use CUDAGuard (which does have
  set_index) because they were CUDA-specific code.

Major caveat: DeviceGuard doesn't *actually* work for non-CUDA/CPU devices. To make
that happen, I plan on totally replacing the implementation of DeviceGuard; what
I mostly care about here is wrangling the API into an acceptable state.

Reviewed By: gchanan

Differential Revision: D12832080

fbshipit-source-id: 7de068c7cec35663dc8a533026a626331336e61d
2018-10-31 07:55:13 -07:00
Tongzhou Wang
46162ccdb9 Autograd indices/values and sparse_coo ctor (#13001)
Summary:
Reopen of #11253 after fixing bug in index_select
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13001

Differential Revision: D10514987

Pulled By: SsnL

fbshipit-source-id: 399a83a1d3246877a3523baf99aaf1ce8066f33f
2018-10-24 10:00:22 -07:00
Yangqing Jia
08aab4dfdd remove ATen/Error.h and ATen/core/Error.h (#12792)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12792

This is a follow up diff after D10238910.

Only non-codemod change is the removal of ATen/Error.h and ATen/core/Error.h. Other files are basically changing the inclusion path + clang format for inclusion order.

Reviewed By: bddppq

Differential Revision: D10437824

fbshipit-source-id: 7f885f80ab5827468d1351cfb2765d0e3f555a69
2018-10-17 17:25:42 -07:00
Yangqing Jia
713e706618 Move exception to C10 (#12354)
Summary:
There is still some work to be done:

- Move logging and unify AT_WARN with LOG(ERROR).
- A few header files are still being plumbed through and need cleaning.
- caffe2::EnforceNotMet aliasing is not done yet.
- Need to unify the macros. See c10/util/Exception.h

This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches:

(1) add //caffe2/c10:c10 to your dependency (or transitive dependency).
(2) change objects such as at::Error, at::Optional to the c10 namespace.
(3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes.

Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354

Reviewed By: orionr

Differential Revision: D10238910

Pulled By: Yangqing

fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32
2018-10-15 13:33:18 -07:00
Gregory Chanan
695465915a Remove some Type.tensor usages and remove native_tensor without size. (#12403)
Summary:
Same as before, but with "initialTensorOptions()" instead of "TensorOptions(false)".
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12403

Differential Revision: D10225427

Pulled By: gchanan

fbshipit-source-id: 60bd025a5cc15bdbbab6eafc91ea55f5f2c3117e
2018-10-05 20:55:14 -07:00
Gregory Chanan
e2d2b270db Revert D10212616: [pytorch][PR] Remove some Type.tensor usages and remove native_tensor without size.
Differential Revision:
D10212616

Original commit changeset: c9cd128d1111

fbshipit-source-id: 923781ba9cd6e60e7c92789832e5601a1fd848b5
2018-10-05 11:55:45 -07:00
Gregory Chanan
705d80b51e Remove some Type.tensor usages and remove native_tensor without size. (#12355)
Summary:
This is to move us along the path to removing Type from the public API.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12355

Reviewed By: ezyang

Differential Revision: D10212616

Pulled By: gchanan

fbshipit-source-id: c9cd128d1111ab219cb0b2f3bf5b632502ab97c0
2018-10-05 11:12:07 -07:00
Gregory Chanan
0947712e5d Move Factory functions from Type to TypeExtendedInterface. (#12025)
Summary:
This makes a few changes wrt Type, with the ultimate goal of removing Type from the public Methods/Functions.  In particular:
1) Removes factory functions from Type, into TypeExtendedInterface.
2) sparse_coo_tensor is now a first class at:: namespace function, with TensorOptions overloads.
3) We move from Type-based sparse_coo_tensor dispatch to function-based.

Note we still require a number of changes to get rid of Type in the public interface; in particular, TensorOptions needs to support CUDA vs non-CUDA dispatch.  That is coming in a future patch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12025

Reviewed By: ezyang

Differential Revision: D10017205

Pulled By: gchanan

fbshipit-source-id: 00807a37b09ed33f0656aaa165bb925abb026320
2018-09-25 09:40:17 -07:00
Gregory Chanan
1178851280 Get rid of most usages of Type.tensor. (#12002)
Summary:
1) Most usages are replaced by at::empty.
2) native_tensor has its namespace function removed
3) Type.tensor(sizes, strides) becomes at::empty_strided(sizes, strides).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12002

Differential Revision: D10007201

Pulled By: gchanan

fbshipit-source-id: 5e5647c050ed2ecb87a33e0b5ce4928fa3186c34
2018-09-24 10:16:18 -07:00
Wei Yang
817e83fc01 fix PR #11061 (#11815)
Summary:
- fix PR https://github.com/pytorch/pytorch/pull/11061 by moving `detach_()` and `set_requires_grad()` to `torch.tensor_ctor()` and `tensor.new_tensor`, and also removing warnings and `args_requires_grad` from `internal_new_from_data`
- with this patch, the returned tensor from `tensor_ctor()` and `new_tensor` will be detached from source tensor, and set requires_grad based on the input args
- `torch.as_tensor` retains its behavior as documented

gchanan apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11815

Differential Revision: D9932713

Pulled By: weiyangfb

fbshipit-source-id: 4290cbc57bd449954faadc597c24169a7b2d8259
2018-09-21 11:04:19 -07:00
Wei Yang
407a9fee0c make copy constructed tensor a leaf variable when using torch.tensor(sourceTensor) (#11061)
Summary:
- fix https://github.com/pytorch/pytorch/issues/10876
- the cause of the bug is that the copy constructor cannot distinguish between the default value of requires_grad and requires_grad=False, so it makes a copy from the source tensor along with its grad_fn when requires_grad=True at the source
- with this fix, the behavior becomes
```
>>> source = torch.randn(2, 2, requires_grad=True)
>>> copy = torch.tensor(source, requires_grad=True)
>>> print(copy)
tensor([[-1.2001,  1.9869],
        [-1.0134,  1.3096]], grad_fn=<CopyBackwards>)

>>> source = torch.randn(2, 2, requires_grad=True)
>>> copy = torch.tensor(source, requires_grad=False)
>>> print(copy)
tensor([[-0.7402,  0.0467],
        [ 0.4344, -0.0420]])

>>> source = torch.randn(2, 2, requires_grad=True)
>>> copy = torch.tensor(source)
>>> print(copy)
tensor([[-0.7402,  0.0467],
        [ 0.4344, -0.0420]])
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11061

Differential Revision: D9569714

Pulled By: weiyangfb

fbshipit-source-id: ea368688bdc0f1ce5997870e164e42835b64b4a1
2018-09-17 23:29:09 -07:00
Gregory Chanan
a8b1755de6 Check device argument makes sense for legacy tensor constructors. (#11669)
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/11427.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11669

Differential Revision: D9817881

Pulled By: gchanan

fbshipit-source-id: 77dc5b0e6bc9884d2616210b96c07e4734058bb6
2018-09-17 08:24:25 -07:00
Adam Paszke
3e665cc29b Improve support for tracing sizes, add more tracer warnings (#11288)
Summary:
Many constructors like `torch.zeros` or `torch.randn` didn't support
size tracing correctly, which is fixed by this patch. The same issue has been
fixed in legacy tensor constructors.

Additionally, new tensor constructors, which do not participate in
tracing (most notably `torch.tensor`, `torch.as_tensor` and
`torch.from_numpy`) raise a warning when they are used.

Finally, entering a traceable operation disables the tracing in its body.
This is needed because

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11288

Reviewed By: ezyang

Differential Revision: D9751183

Pulled By: apaszke

fbshipit-source-id: 51444a39d76a3e164adc396c432fd5ee3c8d5f7f
2018-09-10 15:22:48 -07:00
vishwakftw
733402bef4 Fix issues with certain heterogeneous types in lists during tensor creation (#11377)
Summary:
Closes #9963
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11377

Differential Revision: D9701824

Pulled By: soumith

fbshipit-source-id: 89c5448fd90ece1b365dc42f775b6b0c73ce790c
2018-09-07 12:56:35 -07:00
Edward Yang
56bdd87b40 Get rid of some uses of type() (#11215)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11215

I found these by deleting the implicit conversion of Type to
TensorOptions and then fixing sites.  This isn't a complete
refactor, because I ran out of steam after fixing this many
and decided to keep the implicit conversion.  Still, why
waste a perfectly good refactor?

Reviewed By: gchanan, cpuhrsch

Differential Revision: D9634750

fbshipit-source-id: 4d8fb778e13e6e24b888b1314a02709b2cb00b62
2018-09-04 20:26:22 -07:00
Edward Yang
0ff1bb0d8a Remove Type constructor from TensorOptions, add Type::options (#11189)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11189

Replaces it with an operator TensorOptions() method on
Type, reestablishing the implicit conversion.  I originally
wanted to get rid of the implicit conversion entirely, but
there were a *lot* of use-sites, so I added it back to avoid
a huge codemod.  In this patch, I only had to fix sites that
used the optional device_index API.

Reviewed By: cpuhrsch

Differential Revision: D9628281

fbshipit-source-id: 5fe2a68eefb77a3c9bb446f03a94ad723ef90210
2018-09-04 08:10:04 -07:00
Edward Yang
750ede7215 Rename getType to getVariableTypeFromBaseType / getVariableType (#11095)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11095

We used getType to mean a lot of things.

- getVariableTypeFromBaseType: given a base Type (non-Variable type)
  compute the Variable Type which corresponds to it.

- getVariableType: like at::getType, but return the Variable type
  rather than the plain type.

This rename makes it clearer at the use-site what things are what,
and will make a subsequent rename of at::getType easier.

Reviewed By: gchanan, cpuhrsch

Differential Revision: D9583630

fbshipit-source-id: 2667ec98e7607bc466920c7415a8c651fd56dfca
2018-08-30 20:11:25 -07:00
Peter Goldsborough
7ddc6f84c4 NULL -> nullptr (#11047)
Summary:
How did we get so many uses of `NULL` again?

ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11047

Differential Revision: D9566799

Pulled By: goldsborough

fbshipit-source-id: 83469f352ac69aa65bdaf1a1a21f922d892e0db3
2018-08-30 16:25:42 -07:00
Will Feng
b14f2e899c Preserve sparse tensor shape and dim invariants, and add scalar tensor support (#9279)
Summary:
When 0-sized dimension support is added, we expect an empty sparse tensor to be a 1-dimensional tensor of size `[0]`, with `sparseDims == 1` and `denseDims == 0`. Also, we expect the following invariants to be preserved at all times:

```
_sparseDims + _denseDims = len(shape)
_indices.shape: dimensionality: 2,  shape: (_sparseDims, nnz)
_values.shape:  dimensionality: 1 + _denseDims.  shape: (nnz, shape[_sparseDims:])
```

This PR fixes various places where the invariants are not strictly enforced when 0-sized dimension support is enabled.

Tested and `test_sparse.py` passes locally on both CPU and CUDA with the `USE_TH_SIZE_ZERO_DIM` flag.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9279

Differential Revision: D8936683

Pulled By: yf225

fbshipit-source-id: 12f5cd7f52233d3b26af6edc20b4cdee045bcb5e
2018-08-23 10:10:24 -07:00
Edward Yang
19031c68dc Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage (#10488)
Summary:
```
Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage

This patch does two major changes:

- It replaces the use of Retainable in Storage with a new implementation
  based on intrusive_ptr.  This will be necessary because Caffe2 will
  be using this class to implement intrusive_ptrs, and we need to
  line these up for the merge.  One good thing about the new implementation is
  that the default copy/move constructors/assignment operators and destructor
  work automatically, instead of needing to be hardcoded into Storage/Tensor.

- It replaces all places where we returned std::unique_ptr<Storage> with
  Storage, collapsing an unnecessary double indirection that is no longer
  necessary now that we have correctly working copy/move constructors.

I didn't initially want to do step (2), but it was very important to
eliminate all bare uses of new Storage and new StorageImpl, and making
this API change was the most straightforward way to do that.

HOW TO FIX YOUR CODE IN THE NEW API

- You no longer need to dereference the result of tensor.storage() to pass
  it to set.  So, instead of:

      x.set_(*y.storage());

  just write:

      x.set_(y.storage());

- If you were accessing methods on StorageImpl via the pImpl() method, you
  must use the dot operator to run pImpl().  Even better; just drop pImpl,
  we now have method forwarding.  So, instead of:

      storage->pImpl()->data();

  just do:

      storage->data();
      // storage.pImpl()->data() works too but is not as recommended

- storage->getDevice() is no more; instead use storage->device().index()

MISC CODE UPDATES

- retain, release, weak_retain, weak_release and weak_lock are now
  reimplemented using the "blessed API", and renamed to make it
  clearer that their use is discouraged.

- nvcc OS X and general OS X portability improvements to intrusive_ptr

- A new comment in intrusive_ptr describing how stack allocated
  intrusive_ptr_targets work differently than heap allocated ones
  from c10::make_intrusive

CAVEAT EMPTOR

- THStorage_weakRetain used to work on strong pointers, but it NO LONGER
  works with intrusive_ptr.  You must reclaim the strong pointer into a
  real strong pointer, construct a weak pointer from it, and then release
  the strong and weak pointers.  See StorageSharing.cpp for an example.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10488

Reviewed By: gchanan

Differential Revision: D9306134

Pulled By: ezyang

fbshipit-source-id: 02d58ef62dab8e4da6131e1a24834a65c21048e2
2018-08-21 21:39:55 -07:00
Edward Yang
6bdbad93b9 Refactor Device to not depend on Backend. (#10478)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10478

- Removed Backend constructor from Device, and fixed all
  use-sites to use DeviceType::CPU instead of kCPU, or
  use a new function backendToDeviceType to perform
  the conversion.
- New method device_type() on Type; it gives you the
  underlying device type, e.g., CPU for SparseCPU.
- We add backward compatibility for kCPU/kCUDA uses,
  by introducing a new special type which is implicitly
  convertible to both DeviceType and Backend.  As long as
  you don't define a function that's overloaded on both
  DeviceType and Backend (but not on BackendOrDeviceType),
  the implicit conversions will ensure that uses
  of at::Device(at::kCPU) keep working. We fixed use-sites in
  the library, but did NOT fix sites in the test code, so that
  we can exercise this BC code.

Reviewed By: Yangqing

Differential Revision: D9301861

fbshipit-source-id: 9a9d88620500715c7b37e655b4fd761f6dd72716
2018-08-18 17:39:14 -07:00
Sebastian Messmer
f51f15bb27 Update include paths for ATen/core (#10130)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10130

Update some include paths to make them internally consistent

Reviewed By: ezyang

Differential Revision: D9119906

fbshipit-source-id: b44e5cab8e8e795ee18afe9ffc6caf1f2b413467
2018-08-03 11:57:02 -07:00
vishwakftw
ea3c36b822 NumPy Scalar to PyTorch Scalar (#9225)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/4985 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9225

Differential Revision: D8769317

Pulled By: ezyang

fbshipit-source-id: eeaeaf0749c9dc9e372634da68b4bd23e6e3ad28
2018-07-30 14:39:40 -07:00
Peter Goldsborough
47492ed451
[C++ API] Bag of fixes (#8843)
* Bag of fixes

* Rename tensor_range.h to tensor_list_view.h

* Post rebase fixes

* Rename torch::tensor namespace to torch::tensors due to name conflict

* Avoid recursion in Module::to
2018-06-25 21:11:49 -07:00
Peter Goldsborough
372d1d6735
Create ATen tensors via TensorOptions (#7869)
* Created TensorOptions

Storing the type in TensorOptions to solve the Variable problem

Created convenience creation functions for TensorOptions and added tests

Converted zeros to TensorOptions

Converted rand to TensorOptions

Fix codegen for TensorOptions and multiple arguments

Put TensorOptions convenience functions into torch namespace too

All factory functions except *_like support TensorOptions

Integrated with recent JIT changes

Support *_like functions

Fix in place modification

Some cleanups and fixes

Support sparse_coo_tensor

Fix bug in Type.cpp

Fix .empty calls in C++ API

Fix bug in Type.cpp

Trying to fix device placement

Make AutoGPU CPU compatible

Remove some auto_gpu.h uses

Fixing some headers

Fix some remaining CUDA/AutoGPU issues

Fix some AutoGPU uses

Fixes to dispatch_tensor_conversion

Reset version of new variables to zero

Implemented parsing device strings

Random fixes to tests

Self review cleanups

flake8

Undo changes to variable.{h,cpp} because they fail on gcc7.2

Add [cuda] tag to tensor_options_cuda.cpp

Move AutoGPU::set_index_from into .cpp file because Windows is stupid and sucks

Fix linker error in AutoGPU.cpp

Fix bad merge conflict in native_functions.yaml

Fixed caffe2/contrib/aten

Fix new window functions added to TensorFactories.cpp

* Removed torch::TensorOptions

Added code to generate wrapper functions for factory methods

Add implicit constructor from Backend to TensorOptions

Remove Var() from C++ API and use torch:: functions

Use torch:: functions more subtly in C++ API

Make AutoGPU::set_device more exception safe

Check status directly in DynamicCUDAHooksInterface

Rename AutoGPU to DeviceGuard

Removed set_requires_grad from python_variables.h and warn appropriately in Variable::set_requires_grad

remove python_default_init: self.type()

Add back original factory functions, but with deprecation warnings

Disable DeviceGuard for a couple functions in ATen

Remove print statement

Fix DeviceGuard construction from undefined tensor

Fixing CUDA device compiler issues

Moved as many methods as possible into header files

Dont generate python functions for deprecated factories

Remove merge conflict artefact

Fix tensor_options_cuda.cpp

Fix set_requires_grad not being checked

Fix tensor_new.h

TEMPORARILY put some methods in .cpp files to see if it solves issues on windows and mac

Fix bug in DeviceGuard.h

Missing includes

TEMPORARILY moving a few more methods into .cpp to see if it fixes windows

Fixing linker errors

* Fix up SummaryOps to use new factories

Undo device agnostic behavior of DeviceGuard

Use -1 instead of optional for default device index

Also move DeviceGuard methods into header

Fixes around device index after optional -> int32_t switch

Fix use of DeviceGuard in new_with_tensor_copy

Fix tensor_options.cpp

* Fix Type::copy(

* Remove test_non_float_params from ONNX tests

* Set requires_grad=False in ONNX tests that use ints

* Put layout/dtype/device on Tensor

* Post merge fixes

* Change behavior of DeviceGuard to match AutoGPU

* Fix C++ API integration tests

* Fix flip functions
2018-06-16 00:40:35 -07:00
Soumith Chintala
dc186cc9fe
Remove NO_* and WITH_* across codebase, except in setup.py (#8555)
* remove legacy options from CMakeLists

* codemod WITH_ to USE_ for WITH_CUDA, WITH_CUDNN, WITH_DISTRIBUTED, WITH_DISTRIBUTED_MW, WITH_GLOO_IBVERBS, WITH_NCCL, WITH_ROCM, WITH_NUMPY

* cover SYSTEM_NCCL, MKLDNN, NNPACK, C10D, NINJA

* removed NO_* variables and hotpatch them only in setup.py

* fix lint
2018-06-15 12:29:48 -04:00
gchanan
7abdc303c6
Don't allow requires_grad to be set on integer Tensor constructors in… (#7185)
* Don't allow requires_grad to be set on integer Tensor constructors in tensor_new.

* Fix autograd test.

* Fix test_distributions.

* Fix test_jit.

* Fix NN tests.
2018-05-18 19:45:10 +02:00
Seth Hendrickson
32b23a4bfc Throw error on tensor creation when sequence shape cannot be determined (#7583)
* first commit

* unit test

* minor style edits
2018-05-18 19:14:42 +02:00
gchanan
8031da5479
Implement torch.as_tensor, similar to numpy.asarray. (#7109)
* Implement torch.as_tensor, similar to numpy.asarray.
torch.as_tensor behaves like torch.tensor except it avoids copies if possible; so also somewhat like tensor.new but without the size overloads.
I didn't add a requires_grad field, because we haven't decided on the semantics such as as_param.

* Remove requires_grad for doc.
2018-05-01 12:54:43 -04:00
Edward Z. Yang
0427afadd1
Make AT_ASSERT/AT_ERROR non-printf based, other tweaks (#7104)
* Make AT_ASSERT/AT_ERROR non-printf based, other tweaks

- AT_ASSERT/AT_ERROR don't take printf strings anymore; instead,
  they take a comma-separated list of things you wanted to print
  (bringing it inline with Caffe2's conventions).

  Instead of AT_ASSERT(x == 0, "%d is not zero", x)
  you write AT_ASSERT(x == 0, x, " is not zero")

  This is done by way of a new variadic template at::str(), which
  takes a list of arguments and cats their string reps (as per
  operator<<) together.

- A bunch of the demangling logic that was in Error.h is now
  moved to Error.cpp (better header hygiene.)  Also, demangle
  has been moved out to its own helper function, and also
  a new helper demangle_type (from Caffe2) added.

- A bunch of AT_ASSERT converted into AT_CHECK, to more properly
  convey which checks can be caused by user error, and which are
  due to logic error in ATen.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* CR

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Fix test failure.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* buildfix

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* More fixes.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* One more fix

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Try harder

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-05-01 10:28:31 -04:00
gchanan
361648a4a7
Fix torch.tensor(...) device-type calculation when used with numpy an… (#6995)
* Fix torch.tensor(...) device-type calculation when used with numpy and type inference.

* Fix tensor device type inference as well.

* Better variable type inference: infer cuda-ness only if device is not specified.
2018-04-27 18:12:33 -04:00
gchanan
18ed2160b0
Use Index rather than Long for IntList parsing (#6674)
* Use Index rather than Long for IntList, so floating-point types convertible to ints fail the parsing.

Basically, our unpackLong code works with floating-point types that are convertible to ints, but this isn't often what you want (because of truncation).
What you actually want is to convert to an index, which will usually find such issues.

I made this the minimal change I could because:
1) I didn't want to change unpackLong because the existing code calls checkLong before unpackLong, so this should be a non-issue most of the time.  And fixing this properly requires calling checkLong again, which will slow everything down.
2) An exception above is with IntList, which only checks that 1) it is a tuple or 2) it is a varargs tuple (i.e. torch.ones(1, 2, 3)).

* Fix bug.

* Don't conflict tensor and IntList bindings.

* Change function to be consistent between python 2 and 3.

* Check Index.

* Move IntList overloads in legacy new functions to below Tensor overloads.
2018-04-26 19:13:23 -04:00
Zachary DeVito
d985cf46f1
Add workaround to fix include warnings in Python 2 builds. (#6716) 2018-04-24 12:30:19 -07:00
gchanan
749d51414a
Separate cuda-ness from dtype. (#6470)
* Separate cuda-ness from dtype.

There are no longer torch.cuda.int64, etc.; only dtypes like torch.int64, which correspond to at::ScalarType.
At the python arg parser level, the corresponding ATen type is selected from the combination of (ScalarType, Layout, Device).

There is also currently unused code in here for supporting ScalarType in native_functions; this will be used for specifying aggregate types
on reduction functions.

* Fix test_autograd.

* Add defaults to randint_like.

* Track is_cuda in py tensor types.

* Fix test_sparse.

* Fix multiprocessing.

* Fix rnn.

* Fix test_nn.

* Fix flake8.
2018-04-12 14:05:44 -04:00
gchanan
87e369111a
Add string-style devices to all tensors. (#6283)
* Add string-style devices to all tensors.

Previously, tensors only had a 'get_device' method, which would throw an exception on a CPU tensor.  This forced if/else branches into code that
was meant to be device agnostic.

This PR implements the following:
1) Adds a 'device' property to all tensors that returns a string representation of the device for all tensors.
For cpu tensors this is 'cpu'.  For cuda tensors this is 'cuda:X', where X is the cuda device ordinal.

2) Adds a DeviceSpec class.  This is just a helper class for separating device_type and device_index specification and to allow partial specification.
For example, you can call DeviceSpec('cuda'), DeviceSpec('cuda:0'), DeviceSpec('cuda', 1).
Also has backwards compatibility support for specifying integers, which are treated as cuda devices.

DeviceSpecs have the following properties:
a) device_type: string representation of the device type (i.e. 'cpu' or 'cuda')
b) device_index: integer for the device index (None if not specified)
c) cuda_device_index: for backwards compatibility; behaves roughly like `get_device` did previously.  I.e. if a function previously took integers for cuda devices,
it can now take DeviceSpecs (or strings), and can maintain the old functionality by calling `old_index = DeviceSpec(old).cuda_device_index`.

3) tensor methods and torch. functions that took integer devices can now take integers, strings, or DeviceSpecs.  For example:
torch.randn((2,3), dtype=torch.cuda.float32, device='cuda:1')

TODO in future PRs:
A) Split out cuda from dtype so you don't need to overspecify cuda-ness
B) We currently only support strings/DeviceSpecs in tensor methods and torch. functions.  We should have equivalents of torch.cuda.device(...), torch.cuda.device_of, etc.
at the torch. level that work on strings/DeviceSpecs

* Add deviceInt64 to python arg parser.

* device_str.

* Remove device_str.

* remove device prefix from attributes.

* Use const char * instead of string.

* Move autogpu index out of Device.

* comment on is_default.

* Rename torch.DeviceSpec to torch.device.

* comment.

* Fix tests.

* Fix flake8.

* Fix sparse_coo_tensor parameter name.

* Improve error message.

* Remove device_ prefix from C++ device object.

* Allocate static strings.

* Return not implemented from rich compare.

* Move torch::Device to THPDevice.

* Remove cuda index.

* Py_RETURN_NOTIMPLEMENTED doesn't exist in python2.
2018-04-06 15:12:05 -04:00
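A minimal sketch of the API described above, using the final name torch.device (this PR's DeviceSpec was renamed in a later bullet); constructing a device object does not require CUDA to be available.

```python
import torch

# Every tensor now exposes a device; CPU tensors report 'cpu'
# instead of raising from get_device().
t = torch.zeros(3)
assert str(t.device) == 'cpu'

# torch.device supports partial and (type, index) specification.
d = torch.device('cuda', 1)
assert d.type == 'cuda' and d.index == 1
assert str(d) == 'cuda:1'
```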
gchanan
4c81282c33
Introduce torch.layout and split layout from dtypes. (#6145)
* Introduce torch.layout and split layout from dtypes.

Tensors (and tensor types) now have a 'layout' attribute that returns either 'torch.strided' or 'torch.sparse_coo'.

Previously, dtypes were 1-to-1 with ATen types/PyTensorTypes; the impetus behind this decision was to make things easy in the common case
(i.e. specifying a type in a factory function).  But this doesn't really follow for sparsity, which isn't a common case.

It also doesn't properly represent the concept of a dtype, which in numpy is a proper scalar type (i.e. roughly the type returned from indexing the
last dimension of an n-d array).  But this should be the same whether or not the tensor is represented via strides, sparsity, etc.

This is accomplished by:
1) having the dtype of tensor return the (device-type, scalar-type) combination, i.e. torch.cuda.float32, so both
   torch.cuda.FloatTensor and torch.cuda.sparse.FloatTensor have the same dtype
2) Adding a layout parameter to python functions, where the combination of (dtype, layout) maps to an ATen type that is used for dispatch.

* Formatting, make init throw python_error.

* Fix cuda not enabled error message.

* Fix test.
2018-04-02 14:07:50 -04:00
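A small sketch of the resulting orthogonality (assuming a recent PyTorch build): the tensors below share one dtype but differ in layout.

```python
import torch

# Layout is now a separate attribute, orthogonal to dtype.
dense = torch.zeros(2, 2)
assert dense.layout == torch.strided

sparse = dense.to_sparse()
assert sparse.layout == torch.sparse_coo
assert sparse.dtype == dense.dtype == torch.float32
```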
Peter Goldsborough
d42fcdbc96 Add source location information to error messages (#6059) 2018-03-29 22:57:18 +02:00
gchanan
6ae0576e1c
Remove dtypes from legacy tensor.new(...) (#6081)
This is in preparation for splitting out sparsity (layout) from dtypes; these are complex to maintain,
and tensor.new(...) is a legacy API in any case.
2018-03-28 18:37:21 -04:00
gchanan
db53389761
Add numpy.array-like type inference to torch.tensor. (#5997)
* Add numpy.array-like type inference to torch.tensor.

* Temporary fix for int/double types.

* Treat python floats as the default (scalar) dtype.

* Also make 0-length sequences the default scalar type and add more tests.

* Add type inference to sparse_coo_tensor.

* Fix sparse test.

* Remove allow_variables.

* Check numpy platform bits.

* Address review comments.

* Make suggested changes to constraints.

* More checks for Windows builds.

* Fix test for windows.
2018-03-27 15:27:23 -04:00
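The inference rules above can be sketched as follows (assuming a recent PyTorch build with the default dtype unchanged): python ints map to int64, while floats and empty sequences fall back to the default scalar dtype.

```python
import torch

# torch.tensor infers dtype from the data, numpy.array-style.
assert torch.tensor([1, 2, 3]).dtype == torch.int64     # python ints
assert torch.tensor([1.0, 2.0]).dtype == torch.float32  # floats -> default dtype
assert torch.tensor([]).dtype == torch.float32          # empty -> default dtype
```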
Vedanuj Goswami
d441396e47 Fix crash in new tensor with numpy array in CUDA (#5850) 2018-03-17 10:23:02 -04:00
gchanan
c474136ee1
[REDO] Add torch.sparse_coo_tensor factory. (#5781)
* Add torch.sparse_coo_tensor factory.

Notes:
1) I didn't add Tensor.new_sparse_coo_tensor; it didn't seem particularly useful, but it's easy to add
2) This doesn't do the type inference, i.e. torch.sparse_coo_tensor(indices=LongTensor, values=IntTensor)
will return a sparse tensor corresponding to the default type rather than a sparse IntTensor.  We can add
type inference later when we add it to other factories.

* Fix merge.

* Use type_conversion function from python_variable_methods.
2018-03-16 13:58:02 -04:00
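A minimal usage sketch of the factory this commit adds (assuming a recent PyTorch build): indices form a 2 x nnz LongTensor whose columns pair with the values.

```python
import torch

# Entries: (0, 1) -> 3.0 and (1, 0) -> 4.0 in a 2x2 sparse tensor.
i = torch.tensor([[0, 1],
                  [1, 0]])
v = torch.tensor([3.0, 4.0])
s = torch.sparse_coo_tensor(i, v, (2, 2))
assert s.layout == torch.sparse_coo
assert s.to_dense()[0, 1].item() == 3.0
assert s.to_dense()[1, 0].item() == 4.0
```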
Soumith Chintala
e40425fd9b
Revert "Add torch.sparse_coo_tensor factory. (#5745)" (#5780)
This reverts commit 361baa5a48.
2018-03-14 13:30:52 -04:00
gchanan
361baa5a48
Add torch.sparse_coo_tensor factory. (#5745)
Notes:
1) I didn't add Tensor.new_sparse_coo_tensor; it didn't seem particularly useful, but it's easy to add
2) This doesn't do the type inference, i.e. torch.sparse_coo_tensor(indices=LongTensor, values=IntTensor)
will return a sparse tensor corresponding to the default type rather than a sparse IntTensor.  We can add
type inference later when we add it to other factories.
2018-03-14 12:10:07 -04:00
gchanan
f6c708f869 Ensure torch.tensor and Tensor.new_tensor copy numpy data. (#5713) 2018-03-12 16:20:10 -04:00
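The copy semantics this commit guarantees can be sketched as follows (assuming recent PyTorch and NumPy builds): torch.tensor copies numpy data, while torch.from_numpy remains the sharing path.

```python
import numpy as np
import torch

a = np.ones(3)
t = torch.tensor(a)      # copies the numpy buffer
a[0] = 5.0
assert t[0].item() == 1.0  # unaffected by the later write

# torch.from_numpy still shares memory when sharing is intended.
s = torch.from_numpy(a)
a[1] = 7.0
assert s[1].item() == 7.0
```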