pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Edward Yang	013ca32730	Devirtualize numel() (#27294 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27294 Fixes #27291 I'm a little annoyed that I have to reintroduce manual binding code. But it's probably not a good idea to teach the codegen how to do fastpath functions (is it?) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D17763486 Pulled By: ezyang fbshipit-source-id: 5793b53e2db80b044e57faae325a95c649d9d459	2019-10-09 11:43:50 -07:00
Brian Vaughan	0c6a18de8d	Add torch.promote_types function Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26655 Test Plan: Imported from OSS Differential Revision: D17556196 Pulled By: nairbv fbshipit-source-id: eeebce8968bfb2ffd25c066595bc19e5dee6ea6f	2019-09-27 16:48:38 -07:00
Ralf Gommers	1b4951d3a5	Fix remaining invalid function cast warnings that show up with GCC 8/9 (#26104 ) Summary: Follow-up to gh-25483, more of the same fixes for warnings like: ``` ../torch/csrc/autograd/python_variable.cpp:503:31: warning: cast between incompatible function types from ‘PyObject* ()(THPVariable)’ {aka ‘_object* ()(THPVariable)’} to ‘getter’ {aka ‘_object* ()(_object, void*)’} [-Wcast-function-type] 503 \| {"_backward_hooks", (getter)THPVariable_get_backwards_hooks, (setter)THPVariable_set_backwards_hooks, nullptr, nullptr}, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` This takes the build log output for a full rebuild with GCC 9.1 from ~10,000 to ~7,000 lines. `clang-tidy` is going to complain, no way around that - see discussion at the end of gh-25483. Pull Request resolved: https://github.com/pytorch/pytorch/pull/26104 Differential Revision: D17396831 Pulled By: ezyang fbshipit-source-id: d71696bfe4dbe25519e4bcb7753151c118bd39f7	2019-09-17 07:43:37 -07:00
Lara Haidar	8ca93ec351	Fix torch.arange traced as constant (#25363 ) Summary: torch.arange is always traced as a constant which makes it impossible to trace correctly TestModel() from the example below. class TestModel(torch.nn.Module): def forward(self, input): return torch.arange(input.shape[0]) input = torch.randn(5,3,2) print(torch.jit.trace(TestModel(), input).graph) Currently the trace of TestModel() looks like: graph(%self : ClassType<TestModel>, %input : Float(5, 3, 2)): %11 : int = prim::Constant[value=5]() %12 : int = prim::Constant[value=4]() %13 : int = prim::Constant[value=0]() %14 : Device = prim::Constant[value="cpu"]() %15 : bool = prim::Constant[value=0]() %16 : Long(5) = aten::arange(%11, %12, %13, %14, %15) return (%16) This PR will allow the trace to have a variable value for %11. The trace of TestModel() with this PR's modifs looks like: graph(%self : ClassType<TestModel>, %input : Float(5, 3, 2)): %2 : int = prim::Constant[value=0]() %3 : int = aten::size(%input, %2) %4 : Long() = prim::NumToTensor(%3) %11 : Scalar = prim::ImplicitTensorToNum(%4) %12 : int = prim::Constant[value=4]() %13 : int = prim::Constant[value=0]() %14 : Device = prim::Constant[value="cpu"]() %15 : bool = prim::Constant[value=0]() %16 : Long(5) = aten::arange(%11, %12, %13, %14, %15) return (%16) More info : https://github.com/pytorch/pytorch/issues/20075 Pull Request resolved: https://github.com/pytorch/pytorch/pull/25363 Reviewed By: zrphercule Differential Revision: D17301934 Pulled By: houseroad fbshipit-source-id: d9907763742cb51d8c761bf63fc2e4918f7b9941	2019-09-11 13:39:54 -07:00
Brian Vaughan	88e4cee3e7	Improve handling of mixed-type tensor operations (#22273 ) Summary: Improve handling of mixed-type tensor operations. This PR affects the arithmetic (add, sub, mul, and div) operators implemented via TensorIterator (so dense but not sparse tensor ops). For these operators, we will now promote to reasonable types where possible, following the rules defined in https://github.com/pytorch/pytorch/issues/9515, and error in cases where the cast would require floating point -> integral or non-boolean to boolean downcasts. The details of the promotion rules are described here: https://github.com/nairbv/pytorch/blob/promote_types_strict/docs/source/tensor_attributes.rst Some specific backwards incompatible examples: * now `int_tensor * float` will result in a float tensor, whereas previously the floating point operand was first cast to an int. Previously `torch.tensor(10) * 1.9` => `tensor(10)` because the 1.9 was downcast to `1`. Now the result will be the more intuitive `tensor(19)` * Now `int_tensor *= float` will error, since the floating point result of this operation can't be cast into the in-place integral type result. See more examples/detail in the original issue (https://github.com/pytorch/pytorch/issues/9515), in the above linked tensor_attributes.rst doc, or in the test_type_promotion.py tests added in this PR: https://github.com/nairbv/pytorch/blob/promote_types_strict/test/test_type_promotion.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/22273 Reviewed By: gchanan Differential Revision: D16582230 Pulled By: nairbv fbshipit-source-id: 4029cca891908cdbf4253e4513c617bba7306cb3	2019-09-05 18:26:09 -07:00
Edward Yang	3d9c419648	Port new_empty to ATen. (#25475 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25475 I got sucked into this rabbit hole when I was trying to understand what I should do with TensorTypeId occurrences in torch/csrc/utils/tensor_new.cpp. I eventually concluded that all of my problems were because Tensor.new_empty was hand implemented and not actually a native function. So I made it a native function. There are a bunch of other new_* functions which should get this treatment, but I'm sending out this PR just to show how it can be done. The general recipe: 1. Implement a concept of TensorOptions merging (TensorOptions::merge_in). This represents the notion of taking a tensor, but "overriding" some of its values with specific overrides. One subtlety here is how devices get merged; see the comments for what our existing behavior is, and how I preserve it. 2. Implement new_empty as a native function, using options merging. 3. Add another special case to Python binding generation to treat new_* similar to *_like (i.e., handle TensorOptions correctly). The logic here is probably wrong, actually; we should codegen TensorOptions correctly no matter what happens, but new_empty follows the same pattern as empty_like so I opted not to touch this code too much. 4. Delete the now defunct manual binding code. 5. Delete manual type annotations that are no longer necessary since we're going through native. I didn't handle memory format correctly here. I don't know if this function should accept memory format; prior memory format patches didn't add support for memory format to new_like. If we had put memory format in TensorOptions this wouldn't have been a question. ghstack-source-id: 89294185 Test Plan: sandcastle & ossci Differential Revision: D17133000 fbshipit-source-id: 00f4e98bd5174f6fd54e8aba2910ea91824771d9	2019-09-04 14:34:39 -07:00
Roy Li	9c8f9f0ecb	Remove many usages of Type (#21941 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21941 ghimport-source-id: f20cca6229daba9eb8652adb3d959266ae081ef1 Test Plan: Imported from OSS Differential Revision: D15893331 Pulled By: li-roy fbshipit-source-id: c988b16008ff0e2725a88c6025afd4aabdaca45a	2019-06-30 04:11:28 -07:00
Brennan Vincent	f4f32cecfd	numpy like nonzero (called nonzero_tuple) (#20293 ) Summary: No performance degradation compared to Numpy when indexing: ``` In [15]: x=torch.randn((1000,1000)) In [16]: %timeit x[x.nonzero_tuple()] 4.63 ms ± 102 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) In [17]: y=x.numpy() In [18]: %timeit y[y.nonzero()] 14.6 ms ± 281 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) In [20]: x=x.t() In [22]: %timeit x[x.nonzero_tuple()] 9.01 ms ± 626 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) In [24]: y=x.numpy() In [25]: %timeit y[y.nonzero()] 16.8 ms ± 770 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/20293 Differential Revision: D15358754 Pulled By: umanwizard fbshipit-source-id: 1344aabd95c969eeda9780c475a39551231879e1	2019-06-06 12:50:59 -07:00
Edward Yang	73a97387c1	Replace AT_CHECK with TORCH_CHECK [shard 9/10] Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20435 Reviewed By: jerryzh168 Differential Revision: D15318877 fbshipit-source-id: 4d83571187ea14a604fef83ac355d328b46d93e1	2019-05-15 08:05:59 -07:00
Roy Li	689dd800ed	Generate only one Type class per backend (#19295 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19295 ghimport-source-id: 9345110f91f044a449804ddd5116cc9179444a00 Differential Revision: D14948581 Pulled By: li-roy fbshipit-source-id: a317b03d58d621e8df162918038f7543bfb13ba2	2019-04-21 21:16:14 -07:00
Vitaly Fedyunin	1c5073fb4b	Adding pin_memory kwarg to zeros, ones, empty, ... tensor constructors (#18952 ) Summary: Make it possible to construct a pinned memory tensor without creating a storage first and without calling pin_memory() function. It is also faster, as copy operation is unnecessary. Supported functions: ```python torch.rand_like(t, pin_memory=True) torch.randn_like(t, pin_memory=True) torch.empty_like(t, pin_memory=True) torch.full_like(t, 4, pin_memory=True) torch.zeros_like(t, pin_memory=True) torch.ones_like(t, pin_memory=True) torch.tensor([10,11], pin_memory=True) torch.randn(3, 5, pin_memory=True) torch.rand(3, pin_memory=True) torch.zeros(3, pin_memory=True) torch.randperm(3, pin_memory=True) torch.empty(6, pin_memory=True) torch.ones(6, pin_memory=True) torch.eye(6, pin_memory=True) torch.arange(3, 5, pin_memory=True) ``` Part of the bigger: `Remove Storage` plan. Now compatible with both torch scripts: ` _1 = torch.zeros([10], dtype=6, layout=0, device=torch.device("cpu"), pin_memory=False)` and ` _1 = torch.zeros([10], dtype=6, layout=0, device=torch.device("cpu"))` Same checked for all similar functions `rand_like`, `empty_like` and others It is fixed version of #18455 Pull Request resolved: https://github.com/pytorch/pytorch/pull/18952 Differential Revision: D14801792 Pulled By: VitalyFedyunin fbshipit-source-id: 8dbc61078ff7a637d0ecdb95d4e98f704d5450ba	2019-04-16 11:06:15 -07:00
Vitaly Fedyunin	b7c830b916	Revert "Adding pin_memory kwarg to zeros, ones, empty,... (#18854 ) Summary: This reverts commit `c484cf43a0`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18854 Differential Revision: D14778393 Pulled By: VitalyFedyunin fbshipit-source-id: 4b5a1f5b1c091bbc4a8e75614734cc011d26b452	2019-04-05 06:25:33 -07:00
Roy Li	d70c6f23f4	Pass ScalarType separately from Type in python constructors Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17786 Reviewed By: ezyang Differential Revision: D14379075 fbshipit-source-id: 3abf066563b789a30cafe5b0c868a41326f5b833	2019-04-04 02:24:20 -07:00
Roy Li	c705d9eb1e	Introduce DeprecatedTypeProperties class (#17991 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17991 changes: -Breaks bc: Tensor::type() now returns DeprecatedTypeProperties& rather than Type&. -Added DeprecatedTypeProperties, it serves as a temporary replacement for Type as the return value of Tensor::type(). This contributes to making Type just for dispatch purposes so that we can make it dtype agnostic. -Tensor::dispatch_type() now returns Type& like Tensor::type() used to do. -Changed callsites of Tensor::type() appropriately. Reviewed By: ezyang Differential Revision: D14443117 fbshipit-source-id: 239ccb7a09626279a71d1a37f8f82e7f57bf7d9e	2019-04-04 02:24:13 -07:00
Vitaly Fedyunin	c484cf43a0	Adding pin_memory kwarg to zeros, ones, empty, ... tensor constructors. (#18455 ) Summary: Make it possible to construct a pinned memory tensor without creating a storage first and without calling pin_memory() function. It is also faster, as copy operation is unnecessary. Supported functions: ```python torch.rand_like(t, pin_memory=True) torch.randn_like(t, pin_memory=True) torch.empty_like(t, pin_memory=True) torch.full_like(t, 4, pin_memory=True) torch.zeros_like(t, pin_memory=True) torch.ones_like(t, pin_memory=True) torch.tensor([10,11], pin_memory=True) torch.randn(3, 5, pin_memory=True) torch.rand(3, pin_memory=True) torch.zeros(3, pin_memory=True) torch.randperm(3, pin_memory=True) torch.empty(6, pin_memory=True) torch.ones(6, pin_memory=True) torch.eye(6, pin_memory=True) torch.arange(3, 5, pin_memory=True) ``` Part of the bigger: `Remove Storage` plan. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18455 Reviewed By: ezyang Differential Revision: D14672084 Pulled By: VitalyFedyunin fbshipit-source-id: 9d0997ec00f59500ee018f8b851934d334012124	2019-04-02 08:48:19 -07:00
Roy Li	7aae51cded	Replace tensor.type().scalarType() calls with tensor.scalar_type() Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17515 Reviewed By: ezyang Differential Revision: D14233250 fbshipit-source-id: 6c7af8d2291c0c2b148001b30cf03834f34366c0	2019-03-08 14:08:18 -08:00
Xiang Gao	2e5a8cee82	Customize the printing of namedtuple return (#17136 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/17112 ```python print("good", torch.randn(5,5,5).max(1)) print("terrible", torch.randn(5,5,10).max(1)) print("not as good", torch.randn(5,5,500).max(1)) print ("old behaviour = gold standard") print(tuple(torch.randn(5,5,5).max(1))) print(tuple(torch.randn(5,5,10).max(1))) print(tuple(torch.randn(5,5,500).max(1))) ``` now gives ``` >>> import torch >>> print("good", torch.randn(5,5,5).max(1)) good torch.return_types.max( values=tensor([[ 1.2821, 1.8063, 1.8075, 1.3082, -0.1267], [ 0.3437, 0.7353, 1.2619, 0.7557, 1.6662], [ 0.8583, 1.8906, 1.0246, 1.7598, 1.1184], [ 1.7821, 0.0230, 0.9452, 1.0318, 1.0823], [ 0.4116, -0.0379, -0.1843, 1.4129, 1.8796]]), indices=tensor([[4, 4, 3, 2, 1], [1, 2, 4, 1, 1], [2, 4, 0, 2, 1], [0, 2, 0, 3, 1], [0, 4, 4, 4, 4]])) >>> print("terrible", torch.randn(5,5,10).max(1)) terrible torch.return_types.max( values=tensor([[ 2.1272, 1.3664, 2.2067, 1.3974, -0.0883, 1.2505, 1.0074, 1.1217, 0.3849, 0.6936], [ 0.6288, -0.4560, 1.2748, 1.5482, 1.2777, 1.6874, 0.7151, 0.6041, 1.3572, 1.6232], [ 1.6703, 1.0075, 1.6480, 2.2839, 1.3390, 0.4938, 1.6449, 1.7628, 0.8141, 2.5714], [ 0.7079, 1.8677, 3.2478, 1.5591, 2.4870, 0.8635, -0.1450, 1.6923, 1.4924, 1.6298], [ 2.4056, 0.8002, 0.9317, 0.7455, 0.7866, 2.1191, 0.3492, 1.2095, 1.8637, 1.7470]]), indices=tensor([[1, 1, 0, 0, 0, 0, 3, 4, 4, 4], [4, 2, 2, 1, 2, 2, 3, 1, 1, 3], [0, 3, 3, 0, 2, 1, 4, 1, 0, 1], [4, 1, 3, 0, 3, 2, 0, 1, 4, 3], [1, 0, 3, 2, 1, 0, 0, 1, 0, 1]])) >>> print("not as good", torch.randn(5,5,500).max(1)) not as good torch.return_types.max( values=tensor([[ 0.3877, 0.7873, 1.8701, ..., 0.5971, 1.6103, -0.3435], [ 1.1300, 2.2418, 1.4239, ..., 1.3943, 0.3872, 1.6475], [ 2.0656, 1.3136, 0.9896, ..., 2.3918, 0.8226, 1.0517], [ 1.1054, 0.9945, 1.0561, ..., 2.1039, 1.1524, 3.0304], [ 1.5041, 2.2809, 1.0883, ..., 0.8504, 2.4774, 1.1041]]), indices=tensor([[4, 3, 1, ..., 1, 4, 0], [4, 4, 4, ..., 3, 0, 3], [3, 0, 1, ..., 2, 2, 4], [0, 1, 1, ..., 4, 2, 2], [1, 0, 4, ..., 2, 0, 2]])) >>> print ("old behaviour = gold standard") old behaviour = gold standard >>> print(tuple(torch.randn(5,5,5).max(1))) (tensor([[ 1.1908, 1.1807, 1.3151, 1.7184, 0.3556], [ 0.3798, 0.9213, 0.3001, 1.3087, 2.2419], [ 1.4233, 1.4814, 1.9900, 1.7744, 1.3059], [ 1.0026, -0.0330, 1.3061, 1.8730, 2.0685], [ 1.3041, 1.6458, 1.3449, 1.8948, 3.6206]]), tensor([[0, 4, 3, 4, 0], [1, 1, 4, 0, 4], [4, 1, 0, 3, 3], [1, 2, 1, 4, 0], [3, 3, 0, 3, 3]])) >>> print(tuple(torch.randn(5,5,10).max(1))) (tensor([[-0.1232, 0.8275, 0.6732, 1.1223, 0.8247, 1.2851, 1.6009, 1.9979, 1.9109, 0.7313], [ 0.2260, 0.5922, 1.6928, 0.6024, 2.1158, 3.0619, 0.5653, 0.7426, 0.8316, 0.6346], [ 0.4319, 0.2231, 0.5255, 1.7620, 1.1657, 0.8875, 0.5782, 0.6506, 0.5032, 1.7097], [ 0.4137, 1.7265, 1.4260, 2.0301, 1.2244, 0.7128, 2.6345, 0.7230, 1.3553, 1.6508], [ 1.0684, 1.7195, 1.4068, 0.7076, -0.0242, 0.8474, 0.8754, 1.7108, 0.2188, 1.1584]]), tensor([[0, 1, 3, 4, 2, 3, 4, 2, 1, 0], [1, 4, 0, 0, 3, 2, 0, 0, 3, 3], [2, 3, 1, 1, 4, 0, 1, 4, 4, 4], [0, 4, 1, 3, 2, 0, 2, 0, 3, 1], [1, 0, 0, 0, 0, 3, 3, 3, 2, 0]])) >>> print(tuple(torch.randn(5,5,500).max(1))) (tensor([[0.9395, 1.5572, 1.8797, ..., 2.0494, 0.8202, 0.9623], [1.7937, 0.7225, 1.8836, ..., 0.7927, 1.4976, 1.1813], [0.8558, 1.6943, 1.4192, ..., 0.8327, 1.9661, 0.4197], [1.2993, 1.4995, 0.9357, ..., 0.7810, 1.3030, 2.6216], [1.4206, 1.8315, 1.0338, ..., 1.4312, 1.3198, 1.5233]]), tensor([[0, 4, 3, ..., 3, 0, 2], [0, 1, 0, ..., 0, 4, 3], [3, 4, 3, ..., 3, 0, 0], [3, 2, 3, ..., 1, 2, 1], [1, 2, 4, ..., 3, 1, 3]])) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/17136 Differential Revision: D14250021 Pulled By: VitalyFedyunin fbshipit-source-id: aae72f03b35980063b1ac1f07b8353eddb0c8b93	2019-02-28 13:07:26 -08:00
Edward Yang	4404762d7d	Rename IntList to IntArrayRef. (#16751 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751 This was made more complicated by the fact that ivalue::IntList is a thing. So I had to fix all of the sites where we referring to IValue post facto. The following codemods were run, in this order: ``` codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>' codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>' ``` Some manual fixups were done afterwards; they can be reviewed separately at https://github.com/pytorch/pytorch/pull/16752 Reviewed By: dzhulgakov Differential Revision: D13954363 fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64	2019-02-05 14:54:34 -08:00
Edward Yang	e35418b3be	New implementations of DeviceGuard, StreamGuard and MultiStreamGuard (with CUDA specializations) (#13342 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13342 This PR introduces a few new concepts: - DeviceGuardImplInterface, and implementations for CPU and CUDA, which provide a generic interface for interfacing with device and stream state, without requiring a direct dependency on the code in question. - InlineDeviceGuard, a general template for generating both specialized and dynamically dispatched device guard implementations. Dynamic dispatch is done by specializing it on a VirtualGuardImpl. - Provide a device-independent DeviceGuard class, which can be used even from CPU code. It uses the aforementioned dynamic dispatch. - CUDA-specialized CUDAGuard class, which doesn't have a dynamic dispatch but can only be used from CUDA. - StreamGuard, which is the same as above, but for streams rather than devices. - Optional variants of all the aforementioned guards, which are a no-op if no device/stream is specified - CUDAMultiStreamGuard, specifically for the case when we want to set a device on every guard. There are some subtle semantic changes, which have been thoroughly documented in the class definition. BC-breaking changes: - Move constructor/assignment have been removed from all device guard implementations. - In some cases where you previously wrote 'set_device' (or 'set_stream'), you now must write 'reset_device', because if you switch devices/device types, the stream/device on the previous device is unset. This is different from previous behavior. - CUDAGuard no longer handles streams, or multiple streams. Use CUDAStreamGuard or CUDAMultiStreamGuard as appropriate for your use case. Reviewed By: dzhulgakov Differential Revision: D12849620 fbshipit-source-id: f61956256f0b12be754b3234fcc73c2abc1be04e	2018-11-11 12:11:10 -08:00
Richard Zou	efab8e8fdf	Speed up tensor.get_device(), is_cuda(), is_sparse() by avoiding dispatches (#12841 ) Summary: `tensor.get_device()` went through two dispatches: once to the native function `get_device()`, and another when `get_device` calls `_th_get_device()`. This PR avoids the dispatch by directly implementing the `get_device` function as a method on Tensor. Future Work: - Investigate caching Device on TensorImpl. This will probably bring the tensor.get_device down to 2ns, but I'm not sure it's worth it. before: ``` ------------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------------ BM_TensorTypeId 0 ns 0 ns 1000000000 BM_TensorType 8 ns 8 ns 89407911 BM_TensorIsCuda 24 ns 24 ns 29313017 BM_TensorIsSparse 27 ns 27 ns 26083160 BM_TensorTypeIsCuda 11 ns 11 ns 65128120 BM_TensorNumel 11 ns 11 ns 68314492 BM_TensorGetDevice 71 ns 71 ns 9633125 BM_DeviceGuardCtor 173 ns 173 ns 4067173 BM_DeviceGuard 232 ns 232 ns `3009690` ``` after: ``` ------------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------------ BM_TensorTypeId 0 ns 0 ns 1000000000 BM_TensorType 10 ns 10 ns 69803872 BM_TensorIsCuda 2 ns 2 ns 321626683 BM_TensorIsSparse 6 ns 6 ns 177045382 BM_TensorNumel 12 ns 12 ns 58770533 BM_TensorGetDevice 4 ns 4 ns 128113396 BM_DeviceGuardCtor 52 ns 52 ns 14997278 BM_DeviceGuard 158 ns 158 ns 5767248 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/12841 Differential Revision: D10489353 Pulled By: zou3519 fbshipit-source-id: a596bc77352f21d5d35433c6de02c2f65aab5f9e	2018-10-25 19:57:52 -07:00
Edward Yang	72822ee6b2	Fix #11430 (CPU only builds raise opaque error message when calling .… (#11533 ) Summary: …cuda()) While I was at it, I audited all other ways I know how we might get a CUDA type from PyTorch and fixed more constructors which don't work. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/11533 Differential Revision: D9775786 Pulled By: ezyang fbshipit-source-id: cd07cdd375fdf74945539ec475a48bf08cbc0c17	2018-09-14 09:10:08 -07:00
Adam Paszke	90e31f4896	Improve tracer warnings (#11545 ) Summary: Also, fix a performance bug in `ensureUnique`. Previously it formatted the warning string even though we weren't tracing, so all that work would always happen in the hot path and be for nothing. A sample of how the new warnings look like: ``` tmp.py:4: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Pytho n values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! int(x) tmp.py:5: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this fun ction to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might caus e the trace to be incorrect. torch.tensor([1.]) tmp.py:6: TracerWarning: There are 2 live references to the data region being modified when tracing in-place operator add_. This might cause t he trace to be incorrect, because all other views that also reference this data will not not reflect this change in the trace! On the other ha nd, if all other views use the same memory, but are disjoint (e.g. are outputs of torch.split), this might still be safe. torch.split(y, 2, dim=1)[0].add_(2) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/11545 Differential Revision: D9782975 Pulled By: apaszke fbshipit-source-id: 5b3abd31366e59c69e0b7ff278042b5563deb5a9	2018-09-11 22:10:32 -07:00
Adam Paszke	3e665cc29b	Improve support for tracing sizes, add more tracer warnings (#11288 ) Summary: Many constructors like `torch.zeros` or `torch.randn` didn't support size tracing correctly which is fixed by this pass. Same issue has been fixed in legacy tensor constructors. Additionally, new tensor constructors, which do not participate in tracing (most notably `torch.tensor`, `torch.as_tensor` and `torch.from_numpy`) raise a warning when they are used. Finally, entering a traceable operation disables the tracing in its body. This is needed because zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11288 Reviewed By: ezyang Differential Revision: D9751183 Pulled By: apaszke fbshipit-source-id: 51444a39d76a3e164adc396c432fd5ee3c8d5f7f	2018-09-10 15:22:48 -07:00
Tongzhou Wang	b9b9ae935b	Make torch.randint have default dtype int64 (#11040 ) Summary: cc gchanan apaszke Pull Request resolved: https://github.com/pytorch/pytorch/pull/11040 Differential Revision: D9565728 Pulled By: SsnL fbshipit-source-id: eb5be9609f30c88f52746fa7e13ad71e2856648e	2018-09-08 07:55:06 -07:00
Edward Yang	b02b125d16	Rename getMaybeVariableType back to getType. (#11250 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11250 ``` codemod -d . --extensions cc,cpp,cu,cuh,h getMaybeVariableType getType ``` Reviewed By: gchanan Differential Revision: D9648830 fbshipit-source-id: 6b2ac2b1c265ae47722390e6e7f106653077d851	2018-09-07 08:11:50 -07:00
Edward Yang	2c5ae8c4bf	Get rid of type() method on TensorOptions; use at::getType instead (#11023 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11023 I'd like TensorOptions to not know anything about Context, so I can move it to ATen/core without pulling in Context. To do this, the type() method has to go, since it consults the context to get a Type. Reviewed By: cpuhrsch Differential Revision: D9562467 fbshipit-source-id: 61a18a76eb042a5e70b64b963501e9d68c25d4f0	2018-08-31 14:27:05 -07:00
Edward Yang	750ede7215	Rename getType to getVariableTypeFromBaseType / getVariableType (#11095 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11095 We used getType to mean a lot of things. - getVariableTypeFromBaseType: given a base Type (non-Variable type) compute the Variable Type which corresponds to it. - getVariableType: like at::getType, but return the Variable type rather than the plain type. This rename makes it clearer at the use-site what things are what, and will make a subsequent rename of at::getType easier. Reviewed By: gchanan, cpuhrsch Differential Revision: D9583630 fbshipit-source-id: 2667ec98e7607bc466920c7415a8c651fd56dfca	2018-08-30 20:11:25 -07:00
Edward Yang	f7b02b3a68	Change Tensor/TensorImpl to use c10::intrusive_ptr (#10824 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10824 API additions: - Tensor(c10::intrusive_ptr<TensorImpl,UndefinedTensor>&&) - Tensor(const c10::intrusive_ptr<TensorImpl,UndefinedTensor>&) - Tensor::operator=(Tensor&&) && (for completeness sake) - TensorBase::unsafeGetTensorImpl() - TensorBase::unsafeReleaseTensorImpl() - TensorBase::getIntrusivePtr() - TensorImpl::type_id() - Tensor::set_data() - Tensor::is_same(Tensor) - Tensor::use_count() - Tensor::type_id() - Tensor::scalar_type() - WeakTensor::is_same(WeakTensor) - intrusive_ptr::weak_use_count() - weak_intrusive_ptr::weak_use_count() - c10::raw::intrusive_ptr::{incref,decref,make_weak} - c10::raw::weak_intrusive_ptr::{incref,decref,lock} API changes: - Tensor::pImpl is no longer public (and now named tensor_impl_) - Most methods accessed this way are now accessible on Tensor maybe_zero_dim() and set_wrapped_number() being prominent exceptions (they are now accessed through unsafeGetTensorImpl()) - Type is no longer friend of Tensor - TensorBase::reset(TensorImpl) is deleted - TensorBase::reset(TensorImpl, bool should_retain) is deleted - TensorBase::swap(TensorBaseImpl&) is deleted; use std::swap instead - TensorBase::get() is deleted; use unsafeGetTensorImpl() instead - TensorBase::detach() is deleted; use unsafeReleaseTensorImpl() instead - TensorBase::retain() is deleted; use _raw_incref() instead - TensorBase::release() is deleted; use _raw_decref() instead - WeakTensor lost most of its methods (it no longer inherits from TensorBase) - TensorImpl::storage() is now a const method - Tensor(TensorBase) constructor removed, instead we go through getIntrusivePtr(). I'm not sure about this change; I happened to have accidentally removed the TensorBase constructor and decided to fix call sites, but I could go the other way. - detail::set_data() is deleted; use Tensor::set_data() instead - c10::raw_intrusive_ptr_target removed; use the functions in c10::raw instead. (The reason for this change, is that it is invalid to cast an intrusive_ptr_target* to a raw_intrusive_ptr_target* to take advantage of the methods. But there is no reason the incref/decref methods shouldn't also work on intrusive_ptr_target; it is primarily an API consideration. We can be more standards compliant by keeping them as functions, which are universally applicable.) - intrusive_ptr::reclaim() and weak_intrusive_ptr::reclaim() now work on pointers of the NullType. (This counts as a bug fix, because the documentation specified that pointers produced by release() are valid to reclaim(), and a release() on a null intrusive_ptr produces the NullType::singleton()) Bug fixes: - Dispatch code for mutable references incorrectly returned a reference to a value argument (which would immediately go out of scope). They now correctly return a tensor by value. - intrusive_ptr copy/move assignment did not work correctly when an object was assigned to itself. We now check for this case and no-op if so. (This bug manifested itself as a Tensor mysteriously becoming an UndefinedTensor after lines of code like 'x = x.mul_(y)') Other changes: - The checked cast functions in Utils.h have now been renamed and detemplatized into checked unwrap functions. - Added type_id() and scalar_type() methods to Tensor - pImpl is no longer public - Documented what the && overloads are doing - All occurrences of 'new TensorImpl' (and similar spellings, like 'new THTensor') have been expunged. This is NO LONGER a valid way to create a new tensor, and if you do this, upon your first incref, you will catch an ASSERT failure saying that only tensors created by intrusive_ptr::release() are valid to reclaim(). Use c10::make_intrusive instead in this situation. - IValue is adjusted to use intrusive_ptr instead of Retainable, and all other sub-classes of Retainable were modified to use intrusive_ptr. When doing this, I had to make the constructors of sub-classes like ConstantList public, so that c10::make_intrusive could invoke them. Fortunately, if you incorrectly stack allocate a ConstantList, and then try to get an intrusive_ptr to it, it will fail, as stack allocated ConstantLists have refcount 0. - IValue very narrowly sidesteps the problem of handling NullType, as it considers intrusive_ptr<TensorImpl> identical to intrusive_ptr<TensorImpl, UndefinedTensor> which is not always true. This was always the case, but there's now a comment explaining what's going on. Some MSVC bugs were uncovered during the preparation of this patch. They are documented as comments in the code. Reviewed By: gchanan Differential Revision: D9481140 fbshipit-source-id: 14a8ea0c231ed88b5715fb86d92730926f9f92fc	2018-08-27 16:11:01 -07:00
Gregory Chanan	c3271b53e4	Remove ability of Scalars to hold Tensors. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10889 Differential Revision: D9512589 Pulled By: gchanan fbshipit-source-id: 8b2b26c9f3a4da31a46f684793ab237e9ef9a323	2018-08-27 07:26:14 -07:00
Wanchao Liang	47c1badf90	Fix the clamp special case and gradient problem on None, add None to JIT (#9596 ) Summary: Supersedes #8925 This PR fixes #8502, it fixes the gradients problem for clamp when passing None to the function, and add support for the NoneLiteral and NoneType in script to enable clamp tests. Now we could have corner cases like: ```python torch.jit.script def func(): x = torch.randn(3, 3, requires_grad=True) y = torch.clamp(x, None, 0) # max = 0 y = torch.clamp(x, min=None, max=0) ``` In both JIT and Aten, we use Scalar(NAN) as a sentinel value when passing None type to function clamp, this is the current way we used to support None type in JIT and to solve the gradient problem when user explicitly passing None into clamp. In JIT side, we create a tensor(NAN) and undefinedTensor if we encounter None when matching the function schema, and later in the interpreter, it will translate to Scalar(NAN) if needed. Ideally we don't need clamp_min and clamp_max in ATenNative/Autograd and could only support clamp after this change, but since bunch of other operators (e.g. Activation.cpp, Loss.cpp) is using clamp_min in several places, we will still have the functions available, but all python invocations will only call clamp instead of clamp_min/max (with calling underlying th_max/th_min in clamp). zdevito jamesr66a Pull Request resolved: https://github.com/pytorch/pytorch/pull/9596 Reviewed By: zdevito Differential Revision: D8940839 Pulled By: wanchaol fbshipit-source-id: c543a867b82e0ab8c99384773b173fdde2605d28	2018-07-27 22:54:33 -07:00
Peter Goldsborough	372d1d6735	Create ATen tensors via TensorOptions (#7869 ) * Created TensorOptions Storing the type in TensorOptions to solve the Variable problem Created convenience creation functions for TensorOptions and added tests Converted zeros to TensorOptions Converted rand to TensorOptions Fix codegen for TensorOptions and multiple arguments Put TensorOptions convenience functions into torch namespace too All factory functions except _like support TensorOptions Integrated with recent JIT changes Support _like functions Fix in place modification Some cleanups and fixes Support sparse_coo_tensor Fix bug in Type.cpp Fix .empty calls in C++ API Fix bug in Type.cpp Trying to fix device placement Make AutoGPU CPU compatible Remove some auto_gpu.h uses Fixing some headers Fix some remaining CUDA/AutoGPU issues Fix some AutoGPU uses Fixes to dispatch_tensor_conversion Reset version of new variables to zero Implemented parsing device strings Random fixes to tests Self review cleanups flake8 Undo changes to variable.{h,cpp} because they fail on gcc7.2 Add [cuda] tag to tensor_options_cuda.cpp Move AutoGPU::set_index_from into .cpp file because Windows is stupid and sucks Fix linker error in AutoGPU.cpp Fix bad merge conflict in native_functions.yaml Fixed caffe2/contrib/aten Fix new window functions added to TensorFactories.cpp * Removed torch::TensorOptions Added code to generate wrapper functions for factory methods Add implicit constructor from Backend to TensorOptions Remove Var() from C++ API and use torch:: functions Use torch:: functions more subtly in C++ API Make AutoGPU::set_device more exception safe Check status directly in DynamicCUDAHooksInterface Rename AutoGPU to DeviceGuard Removed set_requires_grad from python_variables.h and warn appropriately in Variable::set_requires_grad remove python_default_init: self.type() Add back original factory functions, but with deprecation warnings Disable DeviceGuard for a couple functions in ATen Remove print statement Fix DeviceGuard construction from undefined tensor Fixing CUDA device compiler issues Moved as many methods as possible into header files Dont generate python functions for deprecated factories Remove merge conflict artefact Fix tensor_options_cuda.cpp Fix set_requires_grad not being checked Fix tensor_new.h TEMPORARILY put some methods in .cpp files to see if it solves issues on windows and mac Fix bug in DeviceGuard.h Missing includes TEMPORARILY moving a few more methods into .cpp to see if it fixes windows Fixing linker errors * Fix up SummaryOps to use new factories Undo device agnostic behavior of DeviceGuard Use -1 instead of optional for default device index Also move DeviceGuard methods into header Fixes around device index after optional -> int32_t switch Fix use of DeviceGuard in new_with_tensor_copy Fix tensor_options.cpp * Fix Type::copy( * Remove test_non_float_params from ONNX tests * Set requires_grad=False in ONNX tests that use ints * Put layout/dtype/device on Tensor * Post merge fixes * Change behavior of DeviceGuard to match AutoGPU * Fix C++ API integration tests * Fix flip functions	2018-06-16 00:40:35 -07:00
gchanan	7abdc303c6	Don't allow requires_grad to be set on integer Tensor constructors in… (#7185 ) * Don't allow requires_grad to be set on integer Tensor constructors in tensor_new. * Fix autograd test. * Fix test_distributions. * Fix test_jit. * Fix NN tests.	2018-05-18 19:45:10 +02:00
gchanan	681baa9254	Restore warning to torch.range. (#7194 ) Also, get rid of warning specification in Declarations.cwrap, which currently has no effect.	2018-05-02 21:53:00 -04:00
gchanan	8031da5479	Implement torch.as_tensor, similar to numpy.asarray. (#7109 ) * Implement torch.as_tensor, similar to numpy.asarray. torch.as_tensor behaves like torch.tensor except it avoids copies if possible; so also somewhat like tensor.new but without the size overloads. I didn't add a requires_grad field, because we haven't decided on the semantics such as as_param. * Remove requires_grad for doc.	2018-05-01 12:54:43 -04:00
Thomas Viehmann	8fbab83c2a	only Tensors of floating point dtype can require gradients (see #7021 ) (#7034 )	2018-04-30 10:20:00 +02:00
gchanan	a6bfa16c17	torch.arange: add numpy-style type inference. (#7016 ) * torch.arange: add numpy-style type inference. This is a backwards-compatibility breaking change. * Fix flake8. * Use at::optional. * Remove unneeded header files. * Use reference wrapper. * Update arange for test. * Address review comments.	2018-04-27 15:11:45 -04:00
gchanan	3d907ef78e	Consistently check 'out' variants against specified dtype/layout/device parameters. (#6973 ) We were previously doing this in the most common cases, but not consistently.	2018-04-25 22:46:42 -04:00
gchanan	749d51414a	Separate cuda-ness from dtype. (#6470 ) * Separate cuda-ness from dtype. There are no longer torch.cuda.int64, etc; only torch.int64 that correspond to at::ScalarType. At the python arg parser level, the corresponding ATen type is selected from the combination of (ScalarType, Layout, Device). There is also currently unused code in here for support ScalarType in native_functions; this will be used for specifying aggregate types on reduction functions. * Fix test_autograd. * Add defaults to randint_like. * Track is_cuda in py tensor types. * Fix test_sparse. * Fix multiprocessing. * Fix rnn. * Fix test_nn. * Fix flake8.	2018-04-12 14:05:44 -04:00
Zhou Chang	d0f395f744	[pytorch] Fix clamp is missing kwarg out (#6028 ) (#6418 ) torch.clamp is out from template code, add it manually, same with auto generated code.	2018-04-09 13:39:31 -04:00
gchanan	4c81282c33	Introduce torch.layout and split layout from dtypes. (#6145 ) * Introduce torch.layout and split layout from dtypes. Tensors (and tensor types) now have a 'layout' attribute that returns either 'torch.strided' or 'torch.sparse_coo'. Previously, dtypes were 1-to-1 with ATen types/PyTensorTypes; the impetus behind this decision was to make things easy in the common case (i.e. specifying a type in a factory function). But this doesn't really follow for sparity, which isn't a common case. It also doesn't properly represent the concept or a dtype, which in numpy are proper scalar types (i.e. roughly the type returned from indexing the last dimension of an n-d array). But this should be the same whether or not the tensor is represented via strides, sparsity, etc. This is accomplished by: 1) having the dtype of tensor return the (device-type, scalar-type) combination, i.e. torch.cuda.float32, so both torch.cuda.FloatTensor and torch.cuda.sparse.FloatTensor have the same dtype 2) Adding a layout parameter to python functions, where the combination of (dtype, layout) maps to an ATen type that is used for dispatch. * Formatting, make init throw python_error. * Fix cuda not enabled error message. * Fix test.	2018-04-02 14:07:50 -04:00
Peter Goldsborough	d42fcdbc96	Add source location information to error messages (#6059 )	2018-03-29 22:57:18 +02:00
gchanan	db53389761	Add numpy.array-like type inference to torch.tensor. (#5997 ) * Add numpy.array-like type inference to torch.tensor. * Temporary fix for int/double types. * Treat python floats as the default (scalar) dtype. * Also make 0-length sequences the default scalar type and add more tests. * Add type inference to sparse_coo_tensor. * Fix sparse test. * Remove allow_variables. * Check numpy platform bits. * Address review comments. * Make suggested changes to constraints. * More checking windows builds. * Fix test for windows.	2018-03-27 15:27:23 -04:00
gchanan	c474136ee1	[REDO] Add torch.sparse_coo_tensor factory. (#5781 ) * Add torch.sparse_coo_tensor factory. Notes: 1) I didn't add Tensor.new_sparse_coo_tensor; it didn't seem particularly useful, but it's easy to add 2) This doesn't do the type inference, i.e. torch.sparse_coo_tensor(indices=LongTensor, values=IntTensor) will return a sparse tensor corresponding to the default type rather than a sparse IntTensor. We can add type inference later when we add it to other factories. * Fix merge. * Use type_conversion function from python_variable_methods.	2018-03-16 13:58:02 -04:00
gchanan	6f5e869259	Add promoteTypes to ATen and torch._promote_types to python. (#5795 ) This isn't hooked up to anything yet, but is necessary for both scalar binary ops in ATen and tensor constructor type inference in PyTorch.	2018-03-15 11:02:28 -04:00
Soumith Chintala	e40425fd9b	Revert "Add torch.sparse_coo_tensor factory. (#5745 )" (#5780 ) This reverts commit `361baa5a48`.	2018-03-14 13:30:52 -04:00
gchanan	361baa5a48	Add torch.sparse_coo_tensor factory. (#5745 ) Notes: 1) I didn't add Tensor.new_sparse_coo_tensor; it didn't seem particularly useful, but it's easy to add 2) This doesn't do the type inference, i.e. torch.sparse_coo_tensor(indices=LongTensor, values=IntTensor) will return a sparse tensor corresponding to the default type rather than a sparse IntTensor. We can add type inference later when we add it to other factories.	2018-03-14 12:10:07 -04:00
gchanan	a3442f62bc	Support native namespace functions with type dispatch. (#5576 ) * Support native namespace functions with type dispatch. Use 'ones' as an example. Note this is a "halfway" solution; i.e. the call chain is: at::ones(shape, dtype) -> dtype.ones(shape, dtype) -> CPUFloatType.ones(shape, dtype) -> at::native::ones(shape, dtype) The "nicer" solution would probably be something like: at::ones(shape, dtype) -> dtype.ones(shape) -> CPUFloatType.ones(shape) -> at::native::ones(shape, this) * Fix type inference. * Fix test install. * Fix extensions. * Put dtype argument at the beginning. * Fix extension.cpp. * Fix rnn. * Move zeros in the same manner. * Fix cuda. * Change randn. * Change rand. * Change randperm. * Fix aten contrib. * Resize in randperm_out. * Implement eye. * Fix sparse zeros. * linspace, logspace. * arange. * range. * Remove type dispatch from gen_python_functions. * Properly generate maybe_init_cuda for type dispatch functions not named type. * Don't duplicate dtype, this parameters for native type dispatched functions. * Call VariableType factory methods from the base type so it gets version number 0. * Address review comments.	2018-03-09 10:52:53 -05:00
Edward Z. Yang	f064c5aa33	Expunge all occurrences of torch._C._VariableFunctions (#5525 ) Some of the call-sites now look a little hokey with this removed, saving that for another patch. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-03-02 12:19:44 -05:00
Sam Gross	ebd32f7bcd	Check that parsed_args contains enough space for all parameters (#5467 )	2018-02-28 14:34:04 -05:00
Sam Gross	48a3349c29	Delete dead Tensor code paths (#5417 ) This deletes most of the dead Tensor code paths, including the TensorMethods cwrap and generic/Tensor.cpp. This also moves the THNN.cwrap/.cpp generation to generate_code which can use ninja if installed.	2018-02-27 17:58:09 -05:00

1 2

58 Commits