72 Commits

Author SHA1 Message Date
Adam Paszke
90e31f4896 Improve tracer warnings (#11545)
Summary:
Also, fix a performance bug in `ensureUnique`. Previously it formatted the warning string even though we weren't tracing, so all that work would *always* happen in the hot path and be for nothing.

A sample of what the new warnings look like:
```
tmp.py:4: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  int(x)
tmp.py:5: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  torch.tensor([1.])
tmp.py:6: TracerWarning: There are 2 live references to the data region being modified when tracing in-place operator add_. This might cause the trace to be incorrect, because all other views that also reference this data will not not reflect this change in the trace! On the other hand, if all other views use the same memory, but are disjoint (e.g. are outputs of torch.split), this might still be safe.
  torch.split(y, 2, dim=1)[0].add_(2)

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11545

Differential Revision: D9782975

Pulled By: apaszke

fbshipit-source-id: 5b3abd31366e59c69e0b7ff278042b5563deb5a9
2018-09-11 22:10:32 -07:00
Adam Paszke
62c9d4ac96 Make .to() methods native functions (to fix JIT tracing)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11491

Differential Revision: D9771121

Pulled By: apaszke

fbshipit-source-id: 08d11101fb12093f8cf913b06359adddf3af9da7
2018-09-11 21:55:42 -07:00
Edward Yang
ac9268f25d Conversions to and from complex numbers. (#11420)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11420

Surprisingly tricky!  Here are the major pieces:

- We grow an even more ludicrous macro
  AT_FORALL_SCALAR_TYPES_WITH_COMPLEX_EXCEPT_COMPLEX_HALF
  which does what it says on the tin.  This is because I was
  too lazy to figure out how to define the necessary conversions
  in and out of ComplexHalf without triggering ambiguity problems.
  It doesn't seem to be as simple as just Half.  Leave it for
  when someone actually wants this.

- Scalar now can hold std::complex<double>.  Internally, it is
  stored as double[2] because nvcc chokes on a non-POD type
  inside a union.

- overflow() checking is generalized to work with complex.
  When converting *to* std::complex<T>, all we need to do is check
  for overflow against T.  When converting *from* complex, we
  must check (1) if To is not complex, that imag() == 0
  and (2) for overflow componentwise.

- convert() is generalized to work with complex<->real conversions.
  Complex to real drops the imaginary component; we rely on
  overflow checking to tell if this actually loses fidelity. To get
  the specializations and overloads to work out, we introduce
  a new Converter class that actually is specializable.

- Complex scalars convert into Python complex numbers

- This probably fixes complex tensor printing, but there is no way
  to test this right now.
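
A minimal sketch of the overflow rule for complex sources described in the list above, written in plain Python rather than the C++ of the patch. The names `overflows` and `FLOAT32_MAX` are hypothetical, float32 is an arbitrary choice of target range, and the handling of infinities and NaN is not modelled:

```python
# Largest finite float32 value; an assumed stand-in for the target type's range.
FLOAT32_MAX = 3.4028234663852886e38


def overflows(value, to_complex, limit=FLOAT32_MAX):
    """Toy overflow check for converting *from* a complex value.

    `to_complex` says whether the target type is itself complex.
    """
    value = complex(value)
    # (1) A non-complex target cannot hold a non-zero imaginary part.
    if not to_complex and value.imag != 0:
        return True
    # (2) Check each component against the target range.
    return abs(value.real) > limit or abs(value.imag) > limit
```

Under these assumptions, `overflows(1 + 2j, to_complex=False)` reports a loss of fidelity, while the same value converts cleanly when the target is complex.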

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Reviewed By: cpuhrsch

Differential Revision: D9697878

Pulled By: ezyang

fbshipit-source-id: 181519e56bbab67ed1e5b49c691b873e124d7946
2018-09-08 16:39:43 -07:00
James Reed
03c06ec93d Traceable detach (#11038)
Summary:
This makes it so `detach` and `detach_` are traceable and also adds a pass to erase them before ONNX export
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11038

Differential Revision: D9588038

Pulled By: jamesr66a

fbshipit-source-id: 263dd3147e24fcb0c716743f37fdb9f84c0015e7
2018-08-31 16:40:42 -07:00
Adam Paszke
780d2792c5 Warn about non-traceable behavior when tracing (#11088)
Summary:
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11088

Differential Revision: D9585527

Pulled By: apaszke

fbshipit-source-id: 29a03cb152d83b626f748fff4501ac9e139994c2
2018-08-31 14:27:00 -07:00
Edward Yang
750ede7215 Rename getType to getVariableTypeFromBaseType / getVariableType (#11095)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11095

We used getType to mean a lot of things.

- getVariableTypeFromBaseType: given a base Type (non-Variable type)
  compute the Variable Type which corresponds to it.

- getVariableType: like at::getType, but return the Variable type
  rather than the plain type.

This rename makes it clearer at the use-site what things are what,
and will make a subsequent rename of at::getType easier.

Reviewed By: gchanan, cpuhrsch

Differential Revision: D9583630

fbshipit-source-id: 2667ec98e7607bc466920c7415a8c651fd56dfca
2018-08-30 20:11:25 -07:00
Edward Yang
19031c68dc Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage (#10488)
Summary:
```
Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage

This patch does two major changes:

- It replaces the use of Retainable in Storage with a new implementation
  based on intrusive_ptr.  This will be necessary because Caffe2 will
  be using this class to implement intrusive_ptrs, and we need to
  line these up for the merge.  One good thing about the new implementation is
  that the default copy/move constructors/assignment operators and destructor
  work automatically, instead of needing to be hardcoded into Storage/Tensor.

- It replaces all places where we returned std::unique_ptr<Storage> with
  Storage, collapsing an unnecessary double indirection that is no longer
  necessary now that we have correctly working copy/move constructors.

I didn't initially want to do step (2), but it was very important to
eliminate all bare uses of new Storage and new StorageImpl, and making
the API change was the most straightforward way to do this.

HOW TO FIX YOUR CODE IN THE NEW API

- You no longer need to dereference the result of tensor.storage() to pass
  it to set.  So, instead of:

      x.set_(*y.storage());

  just write:

      x.set_(y.storage());

- If you were accessing methods on StorageImpl via the pImpl() method, you
  must use the dot operator to run pImpl().  Even better; just drop pImpl,
  we now have method forwarding.  So, instead of:

      storage->pImpl()->data();

  just do:

      storage->data();
      // storage.pImpl()->data() works too but is not as recommended

- storage->getDevice() is no more; instead use storage->device().index()

MISC CODE UPDATES

- retain, release, weak_retain, weak_release and weak_lock are now
  reimplemented using the "blessed API", and renamed to make it
  clearer that their use is discouraged.

- nvcc OS X and general OS X portability improvements to intrusive_ptr

- A new comment in intrusive_ptr describing how stack allocated
  intrusive_ptr_targets work differently than heap allocated ones
  from c10::make_intrusive

CAVEAT EMPTOR

- THStorage_weakRetain used to work on strong pointers, but it NO LONGER
  works with intrusive_ptr.  You must reclaim the strong pointer into a
  real strong pointer, construct a weak pointer from it, and then release
  the strong and weak pointers.  See StorageSharing.cpp for an example.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10488

Reviewed By: gchanan

Differential Revision: D9306134

Pulled By: ezyang

fbshipit-source-id: 02d58ef62dab8e4da6131e1a24834a65c21048e2
2018-08-21 21:39:55 -07:00
Edward Yang
6bdbad93b9 Refactor Device to not depend on Backend. (#10478)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10478

- Removed Backend constructor from Device, and fixed all
  use-sites to use DeviceType::CPU instead of kCPU, or
  use a new function backendToDeviceType to perform
  the conversion.
- New method device_type() on Type; it gives you the
  underlying device type, e.g., CPU for SparseCPU.
- We add backward compatibility for kCPU/kCUDA uses,
  by introducing a new special type which is implicitly
  convertible to both DeviceType and Backend.  As long as
  you don't define a function that's overloaded on both
  DeviceType and Backend (but not on BackendOrDeviceType),
  the implicit conversions will ensure that uses
  of at::Device(at::kCPU) keep working. We fixed use-sites in
  the library, but did NOT fix sites in the test code, so that
  we can exercise this BC code.

Reviewed By: Yangqing

Differential Revision: D9301861

fbshipit-source-id: 9a9d88620500715c7b37e655b4fd761f6dd72716
2018-08-18 17:39:14 -07:00
Sebastian Messmer
f51f15bb27 Update include paths for ATen/core (#10130)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10130

Update some include paths to make them internally consistent

Reviewed By: ezyang

Differential Revision: D9119906

fbshipit-source-id: b44e5cab8e8e795ee18afe9ffc6caf1f2b413467
2018-08-03 11:57:02 -07:00
Wanchao Liang
47c1badf90 Fix the clamp special case and gradient problem on None, add None to JIT (#9596)
Summary:
Supersedes #8925

This PR fixes #8502: it fixes the gradient problem for clamp when None is passed to the function, and adds support for NoneLiteral and NoneType in script to enable clamp tests. We can now handle corner cases like:

```python
@torch.jit.script
def func():
    x = torch.randn(3, 3, requires_grad=True)
    y = torch.clamp(x, None, 0) # max = 0
    y = torch.clamp(x, min=None, max=0)
```

In both JIT and ATen, we use Scalar(NAN) as a sentinel value when None is passed to clamp. This is how we currently support the None type in JIT, and it solves the gradient problem when the user explicitly passes None into clamp.

On the JIT side, we create a tensor(NAN) and undefinedTensor if we encounter None when matching the function schema; later, in the interpreter, it is translated to Scalar(NAN) if needed.

Ideally we wouldn't need clamp_min and clamp_max in ATenNative/Autograd and could support only clamp after this change, but since a bunch of other operators (e.g. Activation.cpp, Loss.cpp) use clamp_min in several places, the functions remain available. All Python invocations will call only clamp instead of clamp_min/max (with clamp calling the underlying th_max/th_min).
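
The NaN-sentinel idea can be illustrated with a scalar-only sketch in plain Python. This is not the ATen implementation: `to_sentinel` and `clamp_scalar` are hypothetical names, and raising when both bounds are missing is an assumption of the sketch.

```python
import math

NAN = float("nan")  # sentinel meaning "this bound was passed as None"


def to_sentinel(bound):
    """Map None to the NaN sentinel, mirroring the Scalar(NAN) convention."""
    return NAN if bound is None else float(bound)


def clamp_scalar(x, min=None, max=None):
    """Clamp one number, treating a NaN bound as 'no bound on this side'."""
    lo, hi = to_sentinel(min), to_sentinel(max)
    if math.isnan(lo) and math.isnan(hi):
        raise ValueError("at least one of min or max must be given")
    if not math.isnan(lo) and x < lo:
        return lo
    if not math.isnan(hi) and x > hi:
        return hi
    return x
```

With this convention, `clamp_scalar(2.0, None, 0)` and `clamp_scalar(2.0, min=None, max=0)` behave the same way, which is the corner case the script example exercises.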

zdevito jamesr66a
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9596

Reviewed By: zdevito

Differential Revision: D8940839

Pulled By: wanchaol

fbshipit-source-id: c543a867b82e0ab8c99384773b173fdde2605d28
2018-07-27 22:54:33 -07:00
Adam Paszke
aa7af94656 Make JIT tracing a thread-local property (#9414)
Summary:
As in the title. Lets us simplify a lot of code.

Depends on #9363, so please review only the last commit.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9414

Reviewed By: zdevito

Differential Revision: D8836496

Pulled By: apaszke

fbshipit-source-id: 9b3c3d1f001a9dc522f8478abc005b6b86cfa3e3
2018-07-19 19:09:39 -07:00
Peter Goldsborough
372d1d6735
Create ATen tensors via TensorOptions (#7869)
* Created TensorOptions

Storing the type in TensorOptions to solve the Variable problem

Created convenience creation functions for TensorOptions and added tests

Converted zeros to TensorOptions

Converted rand to TensorOptions

Fix codegen for TensorOptions and multiple arguments

Put TensorOptions convenience functions into torch namespace too

All factory functions except *_like support TensorOptions

Integrated with recent JIT changes

Support *_like functions

Fix in place modification

Some cleanups and fixes

Support sparse_coo_tensor

Fix bug in Type.cpp

Fix .empty calls in C++ API

Fix bug in Type.cpp

Trying to fix device placement

Make AutoGPU CPU compatible

Remove some auto_gpu.h uses

Fixing some headers

Fix some remaining CUDA/AutoGPU issues

Fix some AutoGPU uses

Fixes to dispatch_tensor_conversion

Reset version of new variables to zero

Implemented parsing device strings

Random fixes to tests

Self review cleanups

flake8

Undo changes to variable.{h,cpp} because they fail on gcc7.2

Add [cuda] tag to tensor_options_cuda.cpp

Move AutoGPU::set_index_from into .cpp file because Windows is stupid and sucks

Fix linker error in AutoGPU.cpp

Fix bad merge conflict in native_functions.yaml

Fixed caffe2/contrib/aten

Fix new window functions added to TensorFactories.cpp

* Removed torch::TensorOptions

Added code to generate wrapper functions for factory methods

Add implicit constructor from Backend to TensorOptions

Remove Var() from C++ API and use torch:: functions

Use torch:: functions more subtly in C++ API

Make AutoGPU::set_device more exception safe

Check status directly in DynamicCUDAHooksInterface

Rename AutoGPU to DeviceGuard

Removed set_requires_grad from python_variables.h and warn appropriately in Variable::set_requires_grad

remove python_default_init: self.type()

Add back original factory functions, but with deprecation warnings

Disable DeviceGuard for a couple functions in ATen

Remove print statement

Fix DeviceGuard construction from undefined tensor

Fixing CUDA device compiler issues

Moved as many methods as possible into header files

Dont generate python functions for deprecated factories

Remove merge conflict artefact

Fix tensor_options_cuda.cpp

Fix set_requires_grad not being checked

Fix tensor_new.h

TEMPORARILY put some methods in .cpp files to see if it solves issues on windows and mac

Fix bug in DeviceGuard.h

Missing includes

TEMPORARILY moving a few more methods into .cpp to see if it fixes windows

Fixing linker errors

* Fix up SummaryOps to use new factories

Undo device agnostic behavior of DeviceGuard

Use -1 instead of optional for default device index

Also move DeviceGuard methods into header

Fixes around device index after optional -> int32_t switch

Fix use of DeviceGuard in new_with_tensor_copy

Fix tensor_options.cpp

* Fix Type::copy(

* Remove test_non_float_params from ONNX tests

* Set requires_grad=False in ONNX tests that use ints

* Put layout/dtype/device on Tensor

* Post merge fixes

* Change behavior of DeviceGuard to match AutoGPU

* Fix C++ API integration tests

* Fix flip functions
2018-06-16 00:40:35 -07:00
Soumith Chintala
dc186cc9fe
Remove NO_* and WITH_* across codebase, except in setup.py (#8555)
* remove legacy options from CMakeLists

* codemod WITH_ to USE_ for WITH_CUDA, WITH_CUDNN, WITH_DISTRIBUTED, WITH_DISTRIBUTED_MW, WITH_GLOO_IBVERBS, WITH_NCCL, WITH_ROCM, WITH_NUMPY

* cover SYSTEM_NCCL, MKLDNN, NNPACK, C10D, NINJA

* removed NO_* variables and hotpatch them only in setup.py

* fix lint
2018-06-15 12:29:48 -04:00
Tongzhou Wang
c0a419e6ba
Add non_blocking to Tensor/Module.to (#7312)
* Add non_blocking to Tensor/Module.to

* flake8

* Add argparse tests

* cpp parse

* Use C++ parser

* use a common parse function with Tensor.to

* fix test_jit

* use THPObjectPtr

* increase refcount for None, True, and False

* address comments

* address comments
2018-06-04 18:46:52 -04:00
Sam Gross
6c7a8318c4
Fix Tensor.type(dtype) not preserving device (#7474)
Note that Tensor.cuda() will still copy the tensor to the current device
if it's a CUDA tensor on a different device.

Fixes #7441
2018-05-10 18:22:13 -04:00
Adam Paszke
0829d4502d
Trace size-dependent expressions correctly (#6554)
This makes the JIT tracer much more robust, by allowing it to record
dependencies on tensor sizes. For example, if you were to trace this
function

def fn(x):
    return x.view(x.size(1), -1)

before this patch, then it would embed the actual value of x.size(1)
in the trace as a constant, making it very hard to have e.g. batch size
independent traces. Now, this will correctly record the dependency, and
will retrieve the size of x at every run.
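
The difference can be modelled without PyTorch by tracking shapes only. This is a toy model, not the JIT tracer: `view_shape`, `trace_with_constant` and `trace_with_dependency` are hypothetical helpers, and a "trace" here is just a Python closure over shapes.

```python
from math import prod


def view_shape(shape, rows):
    """Shape produced by view(rows, -1) on a tensor with the given shape."""
    total = prod(shape)
    if total % rows != 0:
        raise ValueError(f"cannot view shape {shape} as ({rows}, -1)")
    return (rows, total // rows)


def trace_with_constant(example_shape):
    """Old behaviour: x.size(1) is read once and baked in as a constant."""
    rows = example_shape[1]  # frozen at trace time
    return lambda shape: view_shape(shape, rows)


def trace_with_dependency():
    """New behaviour: x.size(1) is re-read from the input on every run."""
    return lambda shape: view_shape(shape, shape[1])
```

Tracing with an example of shape (2, 3) and then running on shape (4, 6), the constant version still reshapes to 3 rows, while the dependency version uses 6 rows as the original function would.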
2018-05-04 10:55:39 +02:00
Thomas Viehmann
8fbab83c2a only Tensors of floating point dtype can require gradients (see #7021) (#7034) 2018-04-30 10:20:00 +02:00
gchanan
d0b0edf27a
Add a requires_grad_() function to tensors. (#6771) 2018-04-19 13:47:24 -04:00
Tongzhou Wang
892be8b779
Make dtype in .to positional rather than kwarg only (#6628) 2018-04-16 14:03:40 -04:00
gchanan
46374ad5c8
Add tensor.to(device) method. (#6588)
* Add tensor.on(device) and tensor.on_device_as(tensor) methods.

* Rename {'on', 'on_device_as'} -> 'to'.

* Fix test ordinal.

* Fix device ordinal again.
2018-04-16 10:50:34 -04:00
gchanan
749d51414a
Separate cuda-ness from dtype. (#6470)
* Separate cuda-ness from dtype.

There are no longer torch.cuda.int64, etc; only torch.int64 that correspond to at::ScalarType.
At the python arg parser level, the corresponding ATen type is selected from the combination of (ScalarType, Layout, Device).

There is also currently unused code in here for supporting ScalarType in native_functions; this will be used for specifying aggregate types
on reduction functions.

* Fix test_autograd.

* Add defaults to randint_like.

* Track is_cuda in py tensor types.

* Fix test_sparse.

* Fix multiprocessing.

* Fix rnn.

* Fix test_nn.

* Fix flake8.
2018-04-12 14:05:44 -04:00
gchanan
87e369111a
Add string-style devices to all tensors. (#6283)
* Add string-style devices to all tensors.

Previously, tensors only had a 'get_device' method which would throw an exception on a CPU tensor. This made it necessary to add if/else branches to code that
was meant to be device agnostic.

This PR implements the following:
1) Adds a 'device' property to all tensors that returns a string representation of the device for all tensors.
For cpu tensors this is 'cpu'.  For cuda tensors this is 'cuda:X', where X is the cuda device ordinal.

2) Adds a DeviceSpec class.  This is just a helper class for separating device_type and device_index specification and to allow partial specification.
For example, you can call DeviceSpec('cuda'), DeviceSpec('cuda:0'), DeviceSpec('cuda', 1).
Also has backwards compatibility support for specifying integers, which are treated as cuda devices.

DeviceSpecs have the following properties:
a) device_type: string representation of the device type (i.e. 'cpu' or 'cuda')
b) device_index: integer for the device index (None if not specified)
c) cuda_device_index: for backwards compatibility; behaves roughly like `get_device` did previously.  I.e. if a function previously took integers for cuda devices,
it can now take DeviceSpecs (or strings), and can maintain the old functionality by calling `old_index = DeviceSpec(old).cuda_device_index`.

3) tensor methods and torch. functions that took integer devices can now take integers, strings, or DeviceSpecs.  For example:
torch.randn((2,3), dtype=torch.cuda.float32, device='cuda:1')
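
A rough sketch of the parsing rules listed above, in plain Python. It is an illustration only and not the code added by this PR: the `DeviceSpec` tuple, `parse_device` and `device_str` are hypothetical stand-ins, and the exact error handling is assumed.

```python
from typing import NamedTuple, Optional


class DeviceSpec(NamedTuple):
    device_type: str                    # 'cpu' or 'cuda'
    device_index: Optional[int] = None  # None if not specified


def parse_device(spec, index=None):
    """Accept 'cuda', 'cuda:0', ('cuda', 1) style arguments, or a bare integer."""
    # Backwards compatibility: bare integers are treated as cuda devices.
    if isinstance(spec, int):
        return DeviceSpec("cuda", spec)
    device_type, sep, ordinal = spec.partition(":")
    if device_type not in ("cpu", "cuda"):
        raise ValueError(f"unknown device type: {device_type!r}")
    if sep:
        if index is not None:
            raise ValueError("device index given twice")
        index = int(ordinal)
    return DeviceSpec(device_type, index)


def device_str(dev):
    """String form: 'cpu', 'cuda', or 'cuda:X' when an index is known."""
    if dev.device_index is None:
        return dev.device_type
    return f"{dev.device_type}:{dev.device_index}"
```

Keeping the index optional is what allows partial specification: `parse_device('cuda')` names a device type without committing to an ordinal.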

TODO in future PRs:
A) Split out cuda from dtype so you don't need to overspecify cuda-ness
B) We currently only support strings/DeviceSpecs in tensor methods and torch. functions.  We should have equivalents of torch.cuda.device(...), torch.cuda.device_of, etc.
at the torch. level that work on strings/DeviceSpecs

* Add deviceInt64 to python arg parser.

* device_str.

* Remove device_str.

* remove device prefix from attributes.

* Use const char * instead of string.

* Move autogpu index out of Device.

* comment on is_default.

* Rename torch.DeviceSpec to torch.device.

* comment.

* Fix tests.

* Fix flake8.

* Fix sparse_coo_tensor parameter name.

* Improve error message.

* Remove device_ prefix from C++ device object.

* Allocate static strings.

* Return not implemented from rich compare.

* Move torch::Device to THPDevice.

* Remove cuda index.

* Py_RETURN_NOTIMPLEMENTED doesn't exist in python2.
2018-04-06 15:12:05 -04:00
Sam Gross
6b3a4637d6
Make the tensor type torch.Tensor instead of torch.autograd.Variable (#5785)
This changes type(tensor) to return `torch.Tensor` instead of
`torch.autograd.Variable`.

This requires a few implementation changes:

 - torch.Tensor is now a regular Python class instead of a
   pseudo-factory like torch.FloatTensor/torch.DoubleTensor
 - torch.autograd.Variable is just a shell with a __new__ function.
   Since no instances are constructed it doesn't have any methods.
 - Adds torch.get_default_dtype() since torch.Tensor.dtype returns
   <attribute 'dtype' of 'torch._C._TensorBase' objects>
2018-04-03 16:29:25 -04:00
gchanan
4c81282c33
Introduce torch.layout and split layout from dtypes. (#6145)
* Introduce torch.layout and split layout from dtypes.

Tensors (and tensor types) now have a 'layout' attribute that returns either 'torch.strided' or 'torch.sparse_coo'.

Previously, dtypes were 1-to-1 with ATen types/PyTensorTypes; the impetus behind this decision was to make things easy in the common case
(i.e. specifying a type in a factory function).  But this doesn't really follow for sparsity, which isn't a common case.

It also doesn't properly represent the concept of a dtype, which in numpy are proper scalar types (i.e. roughly the type returned from indexing the
last dimension of an n-d array).  But this should be the same whether or not the tensor is represented via strides, sparsity, etc.

This is accomplished by:
1) having the dtype of tensor return the (device-type, scalar-type) combination, i.e. torch.cuda.float32, so both
   torch.cuda.FloatTensor and torch.cuda.sparse.FloatTensor have the same dtype
2) Adding a layout parameter to python functions, where the combination of (dtype, layout) maps to an ATen type that is used for dispatch.
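
As a toy illustration of point 2, the (dtype, layout) pair can be thought of as a key into a dispatch table. The table below is a hypothetical stand-in using plain strings, not the real lookup, and defaulting the layout to strided is an assumption of the sketch.

```python
# Hypothetical dispatch table: (dtype, layout) -> tensor type name.
ATEN_TYPES = {
    ("torch.cuda.float32", "torch.strided"): "torch.cuda.FloatTensor",
    ("torch.cuda.float32", "torch.sparse_coo"): "torch.cuda.sparse.FloatTensor",
}


def resolve_type(dtype, layout="torch.strided"):
    """Pick the type used for dispatch from the (dtype, layout) pair."""
    return ATEN_TYPES[(dtype, layout)]
```

Both entries share one dtype; only the layout decides whether the dense or the sparse type is selected.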

* Formatting, make init throw python_error.

* Fix cuda not enabled error message.

* Fix test.
2018-04-02 14:07:50 -04:00
gchanan
c474136ee1
[REDO] Add torch.sparse_coo_tensor factory. (#5781)
* Add torch.sparse_coo_tensor factory.

Notes:
1) I didn't add Tensor.new_sparse_coo_tensor; it didn't seem particularly useful, but it's easy to add
2) This doesn't do the type inference, i.e. torch.sparse_coo_tensor(indices=LongTensor, values=IntTensor)
will return a sparse tensor corresponding to the default type rather than a sparse IntTensor.  We can add
type inference later when we add it to other factories.

* Fix merge.

* Use type_conversion function from python_variable_methods.
2018-03-16 13:58:02 -04:00
James Reed
55af142b44 Traceable dispatch for cast methods (#5629)
Previously, methods like int() and long() would fail tracing because they eventually dispatch down to toType, which takes a Type as a parameter. We don't (currently) support tracing ops with Type inputs[0], so this PR adds specializations for the ATen scalar types and dispatches to those directly. These specialized ops can be traced into the IR without needing a Type argument.

A more long-term solution would be to add support for Types in the IR.

* Traceable dispatch for Variable cast methods

* Add ONNX symbolics

* Fix test

* Fix cross-backend copy issue

* Prepend underscores to cast identifiers

* Metaprogram symbolics

* clang-format

* stupid lint

* Add comments for all code fragments
2018-03-12 19:01:14 -04:00
gchanan
ae0c04c773
Add torch.empty, torch.full and new_ size Tensor factory methods. (#5668)
* Add torch.empty, torch.full and new_ size Tensor factory methods.

This adds torch.full, torch.empty equivalents of np.full, np.empty.
In addition, this adds size-based Tensor factory methods new_empty, new_ones, new_full, new_zeros,
which are meant to complete the separation of the legacy "new" method into data-based and size-based
functions.

This also fixes an issue in sparse zeros_like when the dtype didn't match the argument dtype.

* Get rid of unnecessary zero in sparse tensor zeros_like.

* Fix test if only 1 cuda device.
2018-03-09 15:29:29 -05:00
peterjc123
377d896969 better solution for the linking error related to lazy_init for MSVC (#5375)
* Revert "Fix wrong argument name (#5366)"

This reverts commit cc9d3b265d.

* Fix wrong argument naming

* Revert "Wrap torch::cuda::lazy_init with WITH_CUDA flag"

This reverts commit a8fa37f8fac5aef09eb7fe54d84de6126618c262.

* Revert "Solves the linking error related to lazy_init for MSVC"

This reverts commit 63913a102f274865a76e7c40ffdf6b40c277d5ff.

* better solution for the linking error related to lazy_init for MSVC

* Naming changes

* Namespace changes and further comment

* Rebasing onto current master

* Remove code that is useless

* Fix linting

* Remove rebasing bugs
2018-02-28 17:34:34 -05:00
Sam Gross
ebd32f7bcd
Check that parsed_args contains enough space for all parameters (#5467) 2018-02-28 14:34:04 -05:00
gchanan
6ab33a820c
Support type conversion via type(dtype). (#5441)
* Support type conversion via type(dtype).

* Merge overloads.
2018-02-28 13:05:38 -05:00
gchanan
94938be367
Support dtypes in legacy new constructors. (#5343)
* Support dtypes in legacy new constructors.

* Add comment about why we don't have dtype for sparse (indices, values).

* separate legacy tensor ctor vs new (new includes dtypes).

* Use TypeError.
2018-02-28 12:52:11 -05:00
gchanan
0250b57978
Avoid extra cpu->cpu copy in dispatch_type. (#5418)
* Avoid extra cpu->cpu copy in dispatch_type.

* Simplify cases.
2018-02-26 17:56:45 -05:00
peterjc123
6c587e9e67 Solves the linking error related to lazy_init for MSVC (#5368)
* Revert "Fix wrong argument name (#5366)"

This reverts commit cc9d3b265d.

* Solves the linking error related to lazy_init for MSVC

* Fix wrong argument naming

* Wrap torch::cuda::lazy_init with WITH_CUDA flag
2018-02-23 11:08:20 -05:00
gchanan
0878c6d4d7
Various dtype improvements. (#5321)
* Various dtype improvements.

1) Add dtypes to the new data-based constructors: Variable.new_tensor and torch.autograd.variable.
2) In the python signatures, use Type instead of Dtype to match the C++ signatures; the error messages still print as dtype.
3) Handle / add a better error message when a dtype is used when ATen was not compiled with that type (e.g. cuda types).
4) Move cuda_lazy_init to its own file.

A later commit will add support to the legacy constructors as well.

* Move implementation of lazy_init to cpp.

* Fix parsed_arg size.
2018-02-21 17:37:59 -05:00
Edward Z. Yang
2b2d56d846
Add missing async deprecated wrapper to tools/autograd/templates/python_variable_methods.cpp (#5196)
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
2018-02-12 23:29:35 -08:00
gchanan
6a9b7132ec
Add a new_tensor instance method to Variable that takes only data. (#5144)
* Add a new_tensor instance method to Variable that takes only data.

This is to work around the legacy problems of new, where e.g.
new(5) will give you an unfilled tensor rather than a scalar.

* Remove double return.

* Fix cuda scalar code path.

* Work around lack of WITH_SCALARS.
2018-02-09 10:59:15 -05:00
Sam Gross
c4d3f69053
Add Variable.item() (#5090)
Variable.item() converts one-element tensors to standard Python numbers.
This operates like float(var) or int(var) depending on
the data type of the Variable.
2018-02-06 17:15:53 -05:00
gchanan
67ff50c30d
Run test_nn criterion tests over Variables, add a scalar test (#5058)
* test_nn working.

* Fix some incorrect scalar assumptions.

* Don't use Variables when we don't have to.

* Use Variable Mixin.

* Fix NLLLoss reference function when WITH_SCALARS not enabled.

* Allow device to be optional in cuda().

* Fix multilabelmarginloss_reference.
2018-02-06 11:11:18 -05:00
Peter Goldsborough
86fd5fd524 Replace async with non_blocking for Python 3.7 (#4999)
* Replace async with non_blocking for Python 3.7 upgrade

* Remove trailing whitespace

* Give _cuda and _type kwargs and accept async for compatibility

* Rename async to non_blocking in all C++ code

* Add entries for async in python_variable_methods

* Friendlier backward compatibility for cuda and type
2018-02-02 09:23:51 -05:00
gchanan
9bb6d33d35
Enable scalars if compiled with WITH_SCALAR environment variable. (#4806)
* Enable scalars if compiled with WITH_SCALAR environment variable.

We are pretty close to enabling scalars (0-dimensional arrays); this allows turning them on
for development purposes and to be able to write code that works both with and without scalars enabled.

WITH_SCALARS is currently broken with distributions, but should work for test_torch, test_autograd, test_nn.

* Fix unsqueeze.

* Fix wrap dim, wrapping with Scalar.
2018-01-23 15:44:11 -05:00
Sam Gross
870ef8e95f
Implement record_stream on Variable (#4728)
The function record_stream is currently only defined on Tensor in
TensorCuda.cwrap. It would be best to implement this in ATen and
automatically bind it to Python, but we're missing ATen types to
represent CUDA streams.
2018-01-19 10:58:13 -05:00
Sam Gross
db6be0e1f1 Fix call to THPUtils_parseSlice (#4732)
* Fix call to THPUtils_parseSlice

THPUtils_parseSlice returns a bool

* Add Variable.__index__

* Add test
2018-01-19 09:39:26 -05:00
Sam Gross
57549b7e44
Bind functions with out= arguments in VariableType (#4565)
This adds overrides in VariableType for the xxx_out ATen functions and
implements Python bindings. There is no support for automatic
differentiation. If any of the inputs (or outputs) requires grad, then the
function will throw an exception unless it's running in "no-grad" mode.

The bindings for calling torch.xxx functions on Variables are moved to a
different object. Previously, they were static methods on VariableBase.
This change prevents users from accidentally calling static methods as if
they were instance methods.
2018-01-17 18:27:42 -05:00
gchanan
eb857ec367
Introduce a (non-public) autograd scalar method and improve printing (#4586)
* Specialize Variable printing and always print device for GPU tensors/Variables.

* Introduce a (non-public) _scalar_sum() method for autograd scalar testing.
2018-01-12 14:26:38 -05:00
Sam Gross
1632ab2979
Fix default device for Variable.new() (#4307)
Variable.new() should default to the device of "self" if no device is
specified. Previously, we were using the current device. This now
matches Tensor.new().
2017-12-21 18:35:35 -05:00
Sam Gross
bec0349280 Implement Variable.cuda and Variable.type using ATen (#4139)
* Implement Variable.cuda using ATen

This adds an optional async flag to Tensor::copy_, which attempts to do
a non-blocking copy if one of the tensors is in pinned memory and
the other is a CUDA tensor.

* Perform cross-device copy in CopyBackwards

Also call torch.cuda._lazy_init() from Variable.cuda()

* Implement Variable.type via ATen

* Changes from review:

 - remove copy_out
 - remove unnecessary include
 - fix default device for .cuda()

* Combine if statements in dispatch_type
2017-12-18 01:54:35 -05:00
Sam Gross
aeb7a3668d
Implement Variable.new (#4080) 2017-12-11 15:45:43 -05:00
Sam Gross
75c11d62b7
Implement Variable.__invert__ (#4082) 2017-12-08 13:05:51 -05:00
Sam Gross
60c03bc09c
Implement apply_, map_, and map2_ in Variable (#4057) 2017-12-07 14:48:56 -05:00
Sam Gross
d0cabbde74
Implement Variable.from_numpy (#4043)
Implements from_numpy using ATen tensors. Variable.from_numpy is a
convenient placeholder for the variant that returns Variables until we
merge Tensor and Variable.

The behavior is slightly changed:

 - from_numpy() on an empty array now returns an empty tensor instead of
   throwing an exception. The shape may not be preserved.
 - CharTensor(ndarray) used to throw an exception. It now copies the
   ndarray. Copying is implemented via ATen toType.
2017-12-06 14:08:56 -05:00