Commit Graph

37 Commits

Author SHA1 Message Date
Edward Yang
3aeb288e40 Make clang-tidy shut up about Python C API macros.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14480

Reviewed By: goldsborough

Differential Revision: D13235001

fbshipit-source-id: cd7f00b12ed3d9ef0fb0d7bd6c428e21561ec1b6
2018-11-28 13:54:42 -08:00
Edward Yang
e35418b3be New implementations of DeviceGuard, StreamGuard and MultiStreamGuard (with CUDA specializations) (#13342)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13342

This PR introduces a few new concepts:

- DeviceGuardImplInterface, and implementations for CPU and CUDA, which
  provide a generic interface for interfacing with device and stream state,
  without requiring a direct dependency on the code in question.
- InlineDeviceGuard, a general template for generating both specialized
  and dynamically dispatched device guard implementations.  Dynamic
  dispatch is done by specializing it on a VirtualGuardImpl.
- Provide a device-independent DeviceGuard class, which can be used even
  from CPU code. It uses the aforementioned dynamic dispatch.
- CUDA-specialized CUDAGuard class, which doesn't have a dynamic dispatch
  but can only be used from CUDA.
- StreamGuard, which is the same as above, but for streams rather than
  devices.
- Optional variants of all the aforementioned guards, which are a no-op if
  no device/stream is specified
- CUDAMultiStreamGuard, specifically for the case when we want to set
  a device on every guard.

There are some subtle semantic changes, which have been thoroughly documented
in the class definition.

BC-breaking changes:

- Move constructor/assignment have been removed from all device guard
  implementations.
- In some cases where you previously wrote 'set_device' (or 'set_stream'), you now must write
  'reset_device', because if you switch devices/device types, the stream/device on the
  previous device is unset.  This is different from previous behavior.
- CUDAGuard no longer handles streams, or multiple streams.  Use CUDAStreamGuard
  or CUDAMultiStreamGuard as appropriate for your use case.
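
The guards above are C++ classes; as a rough Python-level illustration of the save-and-restore semantics they provide, the pre-existing torch.cuda.device and torch.cuda.stream context managers behave analogously. This is only a sketch of the concept, not the new C++ API:

```python
import torch

# Illustrative sketch only: the commit adds C++ guard classes; these Python
# context managers merely show the same save/restore pattern.
if torch.cuda.is_available() and torch.cuda.device_count() > 1:
    s = torch.cuda.Stream(device=1)
    with torch.cuda.device(1):        # current device switched to 1, restored on exit
        with torch.cuda.stream(s):    # current stream switched to s, restored on exit
            x = torch.randn(3, device='cuda')   # allocated on device 1
    # original device and default stream are restored here
```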

Reviewed By: dzhulgakov

Differential Revision: D12849620

fbshipit-source-id: f61956256f0b12be754b3234fcc73c2abc1be04e
2018-11-11 12:11:10 -08:00
Bram Wasti
1616587540 Redo jit/type and utils/functional to ATen/core (#13455)
Summary:
This is a redo of the previous move which broke OS X and Windows tests -- RTTI seemed to be broken
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13455

Differential Revision: D12883775

Pulled By: bwasti

fbshipit-source-id: 2b6c65e8150e6f89624c6ee99c389335c6fb4bb8
2018-11-07 18:11:29 -08:00
Edward Yang
c0e24443f7 Revert D10459665: [c10] Redo jit/type and utils/functional to ATen/core
Differential Revision:
D10459665

Original commit changeset: 563dec9987aa

fbshipit-source-id: bea1dac93ebe73c9e09753d641f04f722d80aef7
2018-11-01 07:26:54 -07:00
Bram Wasti
10a6a3e404 Redo jit/type and utils/functional to ATen/core (#12862)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12862

This is a redo of the previous move in a way that doesn't migrate the namespace -- also will check for the windows cudnn build failure

Reviewed By: Yangqing

Differential Revision: D10459665

fbshipit-source-id: 563dec9987aa979702e6d71072ee2f4b2d969d69
2018-10-31 19:57:43 -07:00
Thomas Viehmann
50c0aedbec Don't segfault on Tensor.__delitem__ (#12726)
Summary:
The mapping protocol stipulates that when `__delitem__` is called, this is passed to `__setitem__` [(well, the same function in the C extension interface)](https://docs.python.org/3/c-api/typeobj.html#c.PyMappingMethods.mp_ass_subscript) with NULL data.

PyTorch master crashes in this situation; with this patch, it no longer does.

Test code (careful, segfaults your interpreter):
```python
import torch
a = torch.randn(5)
del a[2]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12726

Differential Revision: D10414244

Pulled By: colesbury

fbshipit-source-id: c49716e1a0a3d9a117ce88fc394858f1df36ed79
2018-10-16 17:24:18 -07:00
Yangqing Jia
713e706618 Move exception to C10 (#12354)
Summary:
There are still a few work to be done:

- Move logging and unify AT_WARN with LOG(ERROR).
- A few header files are still being plumbed through, need cleaning.
- caffe2::EnforceNotMet aliasing is not done yet.
- need to unify the macros. See c10/util/Exception.h

This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches:

(1) add //caffe2/c10:c10 to your dependency (or transitive dependency).
(2) change objects such as at::Error, at::Optional to the c10 namespace.
(3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes.

Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354

Reviewed By: orionr

Differential Revision: D10238910

Pulled By: Yangqing

fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32
2018-10-15 13:33:18 -07:00
Christian Puhrsch
a9e6a673ae Remove caffe2::Tensor::capacity_nbytes, at::Tensor::to##name##Data, (#11876)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11876

Modern C++ API instead of macros; item() is aligned with the Python frontend. caffe2::Tensor::capacity_nbytes is effectively unused and confusing w.r.t. caffe2::Tensor::nbytes().

codemod -d caffe2           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCByte   "item<uint8_t>"
codemod -d caffe2           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCLong   "item<int64_t>"
codemod -d caffe2           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCInt    "item<int32_t>"
codemod -d caffe2           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCDouble "item<double>"
codemod -d caffe2           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCFloat  "item<float>"

codemod -d caffe2           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toByteData   "data<uint8_t>"
codemod -d caffe2           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toLongData   "data<int64_t>"
codemod -d caffe2           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toIntData    "data<int32_t>"
codemod -d caffe2           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toDoubleData "data<double>"
codemod -d caffe2           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toFloatData  "data<float>"

codemod -d hphp           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCByte   "item<uint8_t>"
codemod -d hphp           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCLong   "item<int64_t>"
codemod -d hphp           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCInt    "item<int32_t>"
codemod -d hphp           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCDouble "item<double>"
codemod -d hphp           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCFloat  "item<float>"

codemod -d hphp           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toByteData   "data<uint8_t>"
codemod -d hphp           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toLongData   "data<int64_t>"
codemod -d hphp           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toIntData    "data<int32_t>"
codemod -d hphp           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toDoubleData "data<double>"
codemod -d hphp           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toFloatData  "data<float>"

codemod -d caffe2 --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCComplexDouble "item<std::complex<double>>"

codemod -d tc           --extensions cc,cpp,cu,cuh,h,py,hpp,mm toCFloat  "item<float>"
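
For reference, the Python-frontend counterpart that the C++ item<T>() accessors are aligned with (illustrative snippet only):

```python
import torch

t = torch.tensor([3.5])
f = t.item()                    # Python float, the analogue of item<float>() / item<double>()
i = torch.tensor([7]).item()    # Python int, the analogue of item<int64_t>()
```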

Reviewed By: ezyang

Differential Revision: D9948572

fbshipit-source-id: 70c9f5390d92b82c85fdd5f8a5aebca338ab413c
2018-09-24 10:40:10 -07:00
David Riazati
1091c5e59f Throw error on indexing a 0 dim tensor (#11679)
Summary:
Following through on the warning that indexing a 0-dim tensor would become an
error in PyTorch 0.5 and that `item()` should be used instead
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11679
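
A minimal illustration of the behavior change (the exact error type is not shown here):

```python
import torch

t = torch.tensor(3.0)    # 0-dim tensor
# t[0]                   # previously warned; now raises an error
v = t.item()             # the supported way to get the Python scalar
```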

Reviewed By: soumith

Differential Revision: D9833570

Pulled By: driazati

fbshipit-source-id: ac19f811fa7320d30b7f60cf66b596d6de684d86
2018-09-19 18:10:03 -07:00
Edward Yang
b2217109ec Move TensorOptions to ATen/core
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11147

Reviewed By: gchanan

Differential Revision: D9614321

fbshipit-source-id: 618cb342eb7c52181425f6bb9c17b9ecdb87a394
2018-09-04 08:55:54 -07:00
James Reed
43e73f85ad Dont optimize slicing dispatch when we are tracing (#11156)
Summary:
Previously when we had a slicing expression like `x[0:5, 0]`, where the sliced tensor was of size `5` in dimension 0, we would skip dispatching the actual slice call as an optimization.

This caused incorrect behavior under tracing, as we would not record the slice op and thus if we encountered an input with a different shape while running the trace, we would get incorrect results.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11156
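
A sketch of the failure mode being fixed, written against the current torch.jit.trace signature (which may differ from the API at the time of this commit):

```python
import torch

def f(x):
    return x[0:5, 0]

# Trace with an input whose dim 0 is exactly 5, the case that used to be
# "optimized away" and therefore not recorded in the trace.
traced = torch.jit.trace(f, torch.randn(5, 4))

# With the fix, the slice op is part of the trace, so an input with a
# different dim-0 size still yields the first five rows of column 0.
out = traced(torch.randn(8, 4))
```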

Differential Revision: D9622252

Pulled By: jamesr66a

fbshipit-source-id: 822f2e8f01504e131f53bd9ef51c171c7913a7cc
2018-09-01 17:13:03 -07:00
Elias Ellison
58b145f515 Fix negative indices in tracer (#10560)
Summary:
Previously, when tracing slicing and select, negative indices would get normalized, fixing the index to the size of the traced tensor. This change makes the behavior the same as script, so aten::select with negative indices is emitted.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10560
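
A small sketch of the traced behavior (assuming the current torch.jit.trace API):

```python
import torch

def last_row(x):
    return x[-1]

traced = torch.jit.trace(last_row, torch.randn(3, 4))
print(traced.graph)               # with the fix, aten::select is emitted with index -1
out = traced(torch.randn(7, 4))   # so the trace still picks the last row here
```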

Differential Revision: D9493614

Pulled By: eellison

fbshipit-source-id: ce7a8bae59863723247208d86b9f2948051ccc6c
2018-08-27 15:19:41 -07:00
Gregory Chanan
34c7c56c73 Re-enable empty n-dimensional empty tensor and fix parallel CPU on empty tensors (#10077)
Summary:
This is a combination of https://github.com/pytorch/pytorch/pull/9947 (this was reverted) and https://github.com/pytorch/pytorch/pull/10076.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10077

Differential Revision: D9087491

Pulled By: gchanan

fbshipit-source-id: 9fe9905628000f2ff3e47df32533cd7d1f25a354
2018-07-31 16:43:45 -07:00
Gregory Chanan
6fb9acfc16 Revert empty n-dim and ATen in C2 integration builds
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10064

Differential Revision: D9082082

Pulled By: gchanan

fbshipit-source-id: ae49470f5b4c89b13beb55fd825de1ba05b6a4fa
2018-07-31 07:25:56 -07:00
Gregory Chanan
ce5f0d40b6 Enable n-dimensional empty tensors. (#9947)
Summary:
These could use some autograd tests, which are coming in a later PR, but using them in autograd is probably pretty rare.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9947

Reviewed By: ezyang

Differential Revision: D9032778

Pulled By: gchanan

fbshipit-source-id: fa5a6509d3bac31ea4fae25143e82de62daabfbd
2018-07-30 12:33:17 -07:00
James Reed
7160846c81 Only view() rhs of index_put if we need to (#9424)
Summary:
During tracing (and export) we were introducing an unnecessary hard-coded view() on the RHS of indexed assignments such as `tensor[idxs] = rhs`. This caused a regression in the PyTorch translate models because these expressions appear with variable RHS sizes. With this change, view() is only called when leading 1-dimensions actually need to be stripped.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9424
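
For context, the kind of indexed assignment in question (hypothetical shapes; the view() is now only inserted when leading 1-dims actually have to be stripped):

```python
import torch

tensor = torch.zeros(4, 3)
idxs = torch.tensor([0, 2])

rhs = torch.ones(1, 2, 3)    # leading 1-dim: a view() is still needed to strip it
tensor[idxs] = rhs

rhs2 = torch.ones(2, 3)      # no leading 1-dims: no view() is inserted
tensor[idxs] = rhs2
```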

Reviewed By: colesbury

Differential Revision: D8838881

Pulled By: jamesr66a

fbshipit-source-id: 399e5daa7d021f4f59f6f92b9fae581f92bfc538
2018-07-14 00:10:21 -07:00
Gregory Chanan
f92edf7ef4 N-dimensional empty tensors: indexing, factories, reductions. (#9209)
Summary:
This PR implements and tests N-dimensional empty tensors for indexing, factories, and reductions if compiled with -DUSE_TH_SIZE_ZERO_DIM.

Still remaining to add:
1) TensorShape functions
2) Simple linear algebra functions (matrix multiply variants)
3) Other functions that operate over a dimension (but don't reduce).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9209
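
Once built with that flag (and in later releases, by default), the Python-visible behavior looks roughly like this:

```python
import torch

e = torch.empty(0, 4)                    # n-dimensional empty tensor, shape (0, 4)
print(e.shape)                           # torch.Size([0, 4])

x = torch.randn(3, 4)
idx = torch.tensor([], dtype=torch.long)
print(x[idx].shape)                      # indexing with an empty index gives (0, 4)

print(e.sum(dim=0).shape)                # reducing over the empty dim gives (4,)
```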

Reviewed By: ezyang

Differential Revision: D8751257

Pulled By: gchanan

fbshipit-source-id: 2113374dc7af6caf31a99bf67b3893f130a29e23
2018-07-09 19:40:01 -07:00
Gregory Chanan
f17b9e4cde Fix boolean indexing. (#8920)
Summary:
Boolean indexing was special-cased to handle a single boolean value, but didn't generally work given multiple booleans.
This PR unifies the behavior with slicing.  Note that only 'True' and torch.tensor(True) behave like NumPy due to the lack of n-dimensional empty tensors.
The corresponding tests for false values have been added, but are guarded behind a flag until we add n-dimensional empty tensors.
Closes https://github.com/pytorch/pytorch/pull/8920
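
Roughly, the NumPy-style behavior this unifies (sketch; the False cases need n-dimensional empty tensors and are still guarded at this point):

```python
import torch

t = torch.randn(2, 3)
print(t[True].shape)                 # (1, 2, 3): a leading dim is added, as in NumPy
print(t[torch.tensor(True)].shape)   # same for a 0-dim boolean tensor
# t[False] would need an n-dimensional empty result, (0, 2, 3), hence the flag
```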

Reviewed By: ezyang

Differential Revision: D8661876

Pulled By: gchanan

fbshipit-source-id: 0dc8a45a303aa41f729d04ab8908cfaf2e3ce3d7
2018-07-03 10:24:12 -07:00
Peter Goldsborough
7ccecbbb4e
Create Tensor::options (#8630) 2018-06-19 11:09:01 -07:00
Peter Goldsborough
372d1d6735
Create ATen tensors via TensorOptions (#7869)
* Created TensorOptions

Storing the type in TensorOptions to solve the Variable problem

Created convenience creation functions for TensorOptions and added tests

Converted zeros to TensorOptions

Converted rand to TensorOptions

Fix codegen for TensorOptions and multiple arguments

Put TensorOptions convenience functions into torch namespace too

All factory functions except *_like support TensorOptions

Integrated with recent JIT changes

Support *_like functions

Fix in place modification

Some cleanups and fixes

Support sparse_coo_tensor

Fix bug in Type.cpp

Fix .empty calls in C++ API

Fix bug in Type.cpp

Trying to fix device placement

Make AutoGPU CPU compatible

Remove some auto_gpu.h uses

Fixing some headers

Fix some remaining CUDA/AutoGPU issues

Fix some AutoGPU uses

Fixes to dispatch_tensor_conversion

Reset version of new variables to zero

Implemented parsing device strings

Random fixes to tests

Self review cleanups

flake8

Undo changes to variable.{h,cpp} because they fail on gcc7.2

Add [cuda] tag to tensor_options_cuda.cpp

Move AutoGPU::set_index_from into .cpp file because Windows is stupid and sucks

Fix linker error in AutoGPU.cpp

Fix bad merge conflict in native_functions.yaml

Fixed caffe2/contrib/aten

Fix new window functions added to TensorFactories.cpp

* Removed torch::TensorOptions

Added code to generate wrapper functions for factory methods

Add implicit constructor from Backend to TensorOptions

Remove Var() from C++ API and use torch:: functions

Use torch:: functions more subtly in C++ API

Make AutoGPU::set_device more exception safe

Check status directly in DynamicCUDAHooksInterface

Rename AutoGPU to DeviceGuard

Removed set_requires_grad from python_variables.h and warn appropriately in Variable::set_requires_grad

remove python_default_init: self.type()

Add back original factory functions, but with deprecation warnings

Disable DeviceGuard for a couple functions in ATen

Remove print statement

Fix DeviceGuard construction from undefined tensor

Fixing CUDA device compiler issues

Moved as many methods as possible into header files

Dont generate python functions for deprecated factories

Remove merge conflict artefact

Fix tensor_options_cuda.cpp

Fix set_requires_grad not being checked

Fix tensor_new.h

TEMPORARILY put some methods in .cpp files to see if it solves issues on windows and mac

Fix bug in DeviceGuard.h

Missing includes

TEMPORARILY moving a few more methods into .cpp to see if it fixes windows

Fixing linker errors

* Fix up SummaryOps to use new factories

Undo device agnostic behavior of DeviceGuard

Use -1 instead of optional for default device index

Also move DeviceGuard methods into header

Fixes around device index after optional -> int32_t switch

Fix use of DeviceGuard in new_with_tensor_copy

Fix tensor_options.cpp

* Fix Type::copy()

* Remove test_non_float_params from ONNX tests

* Set requires_grad=False in ONNX tests that use ints

* Put layout/dtype/device on Tensor

* Post merge fixes

* Change behavior of DeviceGuard to match AutoGPU

* Fix C++ API integration tests

* Fix flip functions
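
The Python-visible counterpart of TensorOptions is the set of dtype/layout/device/requires_grad keyword arguments on factory functions, together with the matching attributes on the resulting tensor (illustrative sketch only):

```python
import torch

t = torch.zeros(2, 3, dtype=torch.float64, device='cpu', requires_grad=True)
print(t.dtype, t.layout, t.device, t.requires_grad)
# torch.float64 torch.strided cpu True
```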
2018-06-16 00:40:35 -07:00
li-roy
6869a5f0fb Throw error on 0-length tensor slicing (#7775)
* throw error on 0-length tensor slicing

* return empty tensor instead of throwing error

* make 0 slice work for tuples also

* add tests

* move check to aten

* Address comments
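
Per the bullets above, the end state is that a 0-length slice returns an empty tensor rather than raising; roughly:

```python
import torch

t = torch.arange(5)
print(t[3:3])          # an empty tensor, not an error
print(t[3:3].shape)    # torch.Size([0])
```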
2018-06-14 17:40:51 -04:00
gchanan
361648a4a7
Fix torch.tensor(...) device-type calculation when used with numpy an… (#6995)
* Fix torch.tensor(...) device-type calculation when used with numpy and type inference.

* Fix tensor device type inference as well.

* Better variable type inference: infer cuda-ness only if device is not specified.
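
A sketch of the intended behavior (hypothetical values):

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0])
a = torch.tensor(arr)            # dtype and device inferred from the array: CPU float64
print(a.device, a.dtype)         # cpu torch.float64
if torch.cuda.is_available():
    b = torch.tensor(arr, device='cuda:0')   # cuda-ness comes only from the explicit device
    print(b.device)                          # cuda:0
```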
2018-04-27 18:12:33 -04:00
Sam Gross
9765bb5f1e Revert "Fix performance regression of simple indexing cases (#6793)" (#6886)
This reverts commit 8a016693c0.
2018-04-23 22:22:12 -04:00
gchanan
8a016693c0
Fix performance regression of simple indexing cases (#6793)
* Fix performance regression on simple cases of indexing

Dispatches to the old kernels

* Adapt JIT test

The test was expected to fail, but due to the change in the previous diff, it would now dispatch to index_select, which succeeds. I modified the function to go through the advanced indexing codepath.

* Only do checks once, properly AutoNoGil, AutoGPU.
2018-04-19 23:41:44 -04:00
gchanan
a4ab83045d
Fix cross device indexing for more than 1 cuda device. (#6781)
* Fix cross device indexing for more than 1 cuda device.

Cross device indexing is attempted from ATen, which doesn't work well because ATen doesn't have AutoGPU, etc.
Instead, before dispatching to ATen we do type conversion on the indices; it would probably be better if we
pushed all this down to ATen, but that will take some work.

* Small cleanup.
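
A hypothetical two-GPU repro of the case being fixed:

```python
import torch

if torch.cuda.device_count() > 1:
    x = torch.randn(5, device='cuda:1')
    idx = torch.tensor([0, 2], device='cuda:0')   # indices on a different device
    y = x[idx]        # indices are converted before dispatching to ATen, so this works
    print(y.device)   # cuda:1
```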
2018-04-19 22:03:25 -04:00
Sam Gross
aa99aa1cb8
Slice (instead of copy) when indexing by a zero-dim tensor (#6426)
Slice (instead of copy) when indexing by a zero-dim tensor

Fixes #6217
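
A small sketch of the new semantics (the result should now share storage with the source):

```python
import torch

x = torch.arange(6).view(2, 3)
i = torch.tensor(1)          # zero-dim index tensor
row = x[i]                   # now a slice (view) of x rather than a copy
row[0] = 100
print(x[1, 0])               # 100: the write goes through to x
```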
2018-04-10 11:47:22 -04:00
gchanan
f6c708f869 Ensure torch.tensor and Tensor.new_tensor copy numpy data. (#5713) 2018-03-12 16:20:10 -04:00
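
Roughly, the guarantee the commit above adds (contrast with torch.from_numpy, which shares memory):

```python
import numpy as np
import torch

arr = np.zeros(3)
t = torch.tensor(arr)       # copies the numpy data
arr[0] = 7
print(t[0].item())          # 0.0: the copy is unaffected

shared = torch.from_numpy(arr)
arr[1] = 9
print(shared[1].item())     # 9.0: from_numpy shares memory
```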
Sam Gross
48a3349c29
Delete dead Tensor code paths (#5417)
This deletes most of the dead Tensor code paths, including the TensorMethods cwrap and generic/Tensor.cpp.

This also moves the THNN.cwrap/.cpp generation to generate_code which can use ninja if installed.
2018-02-27 17:58:09 -05:00
Peter Goldsborough
2d5fbe6e0d Improve Variable interface (#5127)
* Improve Variable interface

* Address comments from @apaszke and @colesbury

* string ::operator= is not noexcept

* Remove ir.h from tracer_state.h to improve build times

* Make Variable a struct and pack SavedVariable fields

* Implement as_variable_ref

* grad_fn_ptr() -> grad_fn_unsafe()

* Reduce hackiness of set_type hack

* Include variable.h and edge.h in tracer_state.h because it uses them

* class Variable -> struct Variable because Windows can't even

* Make Variable::output_nr uint32_t instead of int

* Add comment about tracing state

* Replaced more static_cast<Variable&> and improve docs

* Remove SavedVariable destructor and construct members in init list

* Clarify docs for Variable

* Variable::set_version -> set_version_counter
2018-02-12 23:26:26 -05:00
Sam Gross
df0a4474c4
Allow and warn when indexing a zero-dim Variable (#5114)
This better maintains backwards compatibility when Tensors and Variables
are merged. For example:

   >>> loss = var.sum().data[0]

Currently, `var.sum().data` is 1-dim, so indexing works. Once scalars are
enabled and Variable and Tensor are merged, it will be zero-dim. This
change allows that expression to continue working (with a warning). In
the future, the canonical way to compute that expression will be:

   >>> loss = float(var.sum())

Or an equivalent alternative:

   >>> loss = var.sum().item()

Also fixes a few error cases.
2018-02-12 17:50:19 -05:00
Peter Goldsborough
f38b6f611e Replace NULL with nullptr in autograd (#5162) 2018-02-12 12:01:52 -08:00
Sam Gross
895aebac08
Use Variable instead of Tensor in Function.forward (#4786)
The Tensor and Variable classes are being merged.
autograd.Function.forward is now called on Variables, but with "no-grad"
mode (torch.no_grad()) enabled.

One benefit is that we no longer have to explicitly track shared
storages.
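
For illustration, a custom Function in the modern static-method style; inside forward the inputs arrive with gradient recording disabled, so shared storages no longer need explicit tracking (a sketch, not part of this commit):

```python
import torch
from torch.autograd import Function

class Square(Function):
    @staticmethod
    def forward(ctx, x):
        # x is received with grad mode disabled (as under torch.no_grad())
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_out

inp = torch.randn(3, requires_grad=True)
Square.apply(inp).sum().backward()
print(inp.grad)     # 2 * inp
```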
2018-02-06 17:24:27 -05:00
gchanan
2648428986 Various indexing fixes around scalars. (#4853)
1) Have 0-dim byte tensors behave like Py_TRUE, Py_FALSE
2) Py_TRUE now properly returns a copy from getitem
3) setitem now properly shapes the LHS consistent with the RHS (this doesn't really matter outside of error messages having the proper shape)
4) setitem supports numpy-style copy_to broadcasting (cuts off prefix 1s from src), so e.g. you can setitem (1,1,2,3) to (2,3) even though
   that doesn't follow the normal inplace broadcasting rules.
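
A sketch of point 4 (numpy-style copy_to broadcasting in setitem):

```python
import torch

dst = torch.zeros(2, 3)
src = torch.ones(1, 1, 2, 3)
dst[:] = src     # prefix 1-dims of src are cut off, so a (1,1,2,3) source
                 # can be assigned into a (2,3) destination
print(dst)
```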
2018-01-25 14:05:14 -05:00
Sam Gross
db6be0e1f1 Fix call to THPUtils_parseSlice (#4732)
* Fix call to THPUtils_parseSlice

THPUtils_parseSlice returns a bool

* Add Variable.__index__

* Add test
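
The new Variable.__index__ lets a 0-dim integer tensor be used wherever Python expects an integer index; a minimal sketch:

```python
import torch

i = torch.tensor(2)
print(['a', 'b', 'c', 'd'][i])   # 'c': the tensor is converted via __index__
```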
2018-01-19 09:39:26 -05:00
gchanan
e426020c87
Move prod, cumprod backwards to C++ (#4394)
* Add view_as as a native_function.

* Move prod, cumprod backwards to C++.

* Update for review requets.

* Review comments.

* Reorder slice parameters so dim is first.

* Update test_slice.

* Update test_autograd.

* Fix flake8.
2018-01-03 16:27:50 -05:00
peter
c7b1d58b16 Fix THP_export for python_variable_indexing.cpp 2017-11-22 17:12:20 +01:00
Sam Gross
4518793aa2
Implement indexing in ATen (#3725)
Implements basic and advanced indexing using ATen tensors/variables.
Basic indexing is translated at the Python-binding level
(python_variable_indexing.cpp) to slice/squeeze/unsqueeze/select calls.
Advanced indexing is implemented in ATen in terms of take() and put()
calls.
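
For orientation, the two code paths in Python terms (a sketch):

```python
import torch

x = torch.randn(4, 5)

# Basic indexing: translated at the Python-binding level into
# slice/select/squeeze/unsqueeze calls; the result is a view of x.
basic = x[1:3, 0]

# Advanced indexing: implemented in ATen (in terms of take()/put());
# the result is a copy.
adv = x[torch.tensor([0, 2])]
```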
2017-11-21 13:19:00 -05:00