Summary:
https://github.com/pytorch/pytorch/pull/40129 fixed the error responsible for the first revert, but exposed another error in the same test.
This PR is intended as the "master copy" for merge, and it runs on full CI.
Two other PRs (restricted to run on a small subset of CI) support debugging DDP failures/hangs with multiple devices per process (`test_c10d.py:DistributedDataParallelTest.test_grad_layout_1devicemodule_2replicaperprocess`):
- https://github.com/pytorch/pytorch/pull/40290 tries the test with purely rowmajor contiguous params on an untouched master. In other words https://github.com/pytorch/pytorch/pull/40290 contains none of this PR's diffs aside from the test itself.
- https://github.com/pytorch/pytorch/pull/40178, for comparison, tries the test with this PR's diffs.
Both fail the same way, indicating failure is unrelated to this PR's other diffs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40358
Differential Revision: D22165785
Pulled By: albanD
fbshipit-source-id: ac7cdd79af5c080ab74341671392dca8e717554e
Summary: NVIDIA's Apex is updating to no longer rely on this behavior, but we're reverting this Python2->Python3 update to unblock internal apex users.
Test Plan: Sandcastle + OSS CI.
Reviewed By: ngimel
Differential Revision: D22146782
fbshipit-source-id: f9483d2cbf9dc3a469ad48a6c863edea3ae51070
Summary:
BC-breaking note:
If a user is calling one of these dunders directly, they are no longer available. Users should update to the Python3-compatible dunders.
Original PR note:
`__div__` (and `__idiv__` and `__rdiv__`) are no longer special dunders in Python3. This PR replaces them with the `__truediv__` (`__itruediv__`, `__rtruediv__`) dunders, since we no longer support Python2.
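As a rough illustration (a hypothetical `Fraction` class, not PyTorch code) of the dunder swap described above:
```
class Fraction:
    def __init__(self, value):
        self.value = value

    # Python3 only consults __truediv__ / __rtruediv__ / __itruediv__;
    # __div__, __rdiv__, __idiv__ are ignored.
    def __truediv__(self, other):
        return Fraction(self.value / other)

    def __rtruediv__(self, other):
        return Fraction(other / self.value)

    def __itruediv__(self, other):
        self.value /= other
        return self
```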
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39151
Differential Revision: D22075713
Pulled By: mruberry
fbshipit-source-id: d318b47b51f7cc4c3728b1606a34d81e49ba0fa1
Summary:
Currently, whether `AccumulateGrad` [steals](67cb018462/torch/csrc/autograd/functions/accumulate_grad.h (L42)) or [clones](67cb018462/torch/csrc/autograd/functions/accumulate_grad.h (L80)) an incoming gradient, the gradient ends up rowmajor contiguous, regardless of its param's layout. If the param's layout is channels last, or otherwise not rowmajor contiguous, later kernels that apply gradients to params are forced into an uncoalesced memory access pattern for either the param or the gradient. This may not sound like a big deal but for any binary op on large tensors it's a >3X increase in gmem traffic => 3X slowdown.
The present PR changes `AccumulateGrad` to prefer, where possible, stashing gradients that match their params' layouts (["Gradient Layout Contract"](https://github.com/pytorch/pytorch/pull/34904/files#diff-ef1a56d24f66b280dcdb401502d6a796R29-R38)).
Allowing `AccumulateGrad` to stash non-rowmajor-contiguous grads means DDP allreduces and DP reduces must allow non-rowmajor-contiguous grads. This PR extends DDP and DP to allow gradients with non-rowmajor-contiguous strides as long as their layout is nonoverlapping and dense.
For good measure, I include changes that allow all five nccl primitives (allreduce, reduce, broadcast, allgather, reducescatter) to act on non-rowmajor-contiguous tensors (again as long as each input's layout is nonoverlapping and dense, and as long as all tensors participating in a given collective have the same layout). The primitive comm changes aren't necessary to enable the DDP changes, but I wasn't sure this would end up true until I had written both sets of changes. I think primitive comm enablement is reasonable to keep in the PR, especially since the code for it is simple.
Channels last params will be a major beneficiary of this PR, but I don't see it as channels-last-specific fix. The spirit is layout matching in general:
- Grads should be stashed with memory layouts matching their params.
- Src and dst tensors on opposite ends of collectives should have matching dense layouts.
This PR also updates autograd docs to describe potential BC-breaking changes below.
## BC notes
ngimel albanD gchanan
#### BC-breaking
In the common case where the user lets AccumulateGrad decide grad layouts, strides for grads of dense but non-rowmajor-contiguous params will change. Any user code that was accustomed to `view(-1)`ing these grads will break.
Also, the circumstances under which a grad can be stolen directly from the backward function that created it, as opposed to deep-copied by AccumulateGrad, have changed. In most cases we expect silent performance improvement, because we expect channels-last-aware backward kernels will create channels last gradients for channels last params. Now those can be stolen, whereas before this PR they were cloned and made rowmajor contiguous. IMO this is a mild BC breakage. Param backward hooks still see grads come in with whatever format the backward kernel gave them. The only BC breakage potential I see is if user code relies somehow on a grad in a hook having or not having the same deep memory as the eventual `param.grad`. Any such users hopefully know they're off the edge of the map and understand how to update their expectations.
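A minimal sketch of the `view(-1)` breakage, using a stand-in channels-last tensor rather than an actual grad (the exact strides a grad ends up with depend on the backward kernel and the param's layout):
```
import torch

# Stand-in for a grad that now matches its channels-last param's strides.
g = torch.randn(8, 3, 4, 4).to(memory_format=torch.channels_last)

flat = g.reshape(-1)   # fine: copies only if the strides require it
# g.view(-1)           # RuntimeError: view size is not compatible with
#                      # input tensor's size and stride
```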
#### BC escape hatches
At alband's recommendation, this PR's changes to AccumulateGrad do not alter the pre-PR code's decisions about whether grad is accumulated in or out of place. Accumulations of new grads onto an existing `.grad` attribute were (usually) in-place before this PR and remain in-place after this PR, keeping the existing `.grad`'s layout. After this PR, if the user wants to force accumulation into a grad with a particular layout, they can preset `param.grad` to a zeroed tensor with the desired strides or call `grad.contiguous(desired format)`. This likely won't be as performant as letting AccumulateGrad establish grad layouts by cloning or stealing grads with contract-compliant strides, but at least users have a control point.
One limitation (present before this PR and unchanged by this PR): Presetting `param.grad` does not ensure in-place accumulation all the time. For example, if `create_graph=True`, or if incoming `new_grad` is dense and existing `variable_grad` is sparse, accumulation occurs out of place, and the out-of-place result may not match the existing grad's strides.
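A minimal sketch of this control point, with a hypothetical standalone channels-last param (the usual in-place accumulation path keeps the preset layout, subject to the limitation above):
```
import torch

# Hypothetical channels-last param.
param = torch.randn(8, 3, 3, 3).to(memory_format=torch.channels_last).requires_grad_()

# Opt back into a rowmajor-contiguous grad for this param by presetting
# .grad before the first backward; in-place accumulation keeps its strides.
param.grad = torch.zeros_like(param, memory_format=torch.contiguous_format)

(param * 2).sum().backward()
print(param.grad.is_contiguous())  # True in the usual in-place accumulation case
```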
----------------------------
I also noticed some potential DDP improvements that I considered out of scope but want to mention for visibility:
1. make sure Reducer's ops sync with AccumulateGrad streams
2. ~to reduce CPU overhead and incur fewer kernel launches, lazily create flat `contents` tensors by a single `cat` kernel only when a bucket is full, instead of `copy_`ing grads into `contents` individually as soon as they are received.~ PR includes a [minor change](https://github.com/pytorch/pytorch/pull/34904/files#diff-c269190a925a4b0df49eda8a8f6c5bd3R312-R315) to divide grads while copying them into flat buffers, instead of copying them in, then dividing separately. Without cat+div fusion, div-while-copying is the best we can do.
3. https://github.com/pytorch/pytorch/issues/38942
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34904
Differential Revision: D20496044
Pulled By: albanD
fbshipit-source-id: 248d680f4b1bf77b0a986451844ec6e254469217
Summary:
Remove PY3 and PY34 checks from `torch/testing/_internal/common_utils.py`
Remove PY35 global var from `torch.jit.annotations`
Always call `try_get_real_signature` in `torch/jit/annotations.py`
Use `map` instead of `imap`; since Python-2 is no longer supported, `map` is always lazy.
Remove all pre Python-3.6 checks from `torch/_six.py` and `torch/_appdirs.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39879
Differential Revision: D22037811
Pulled By: malfet
fbshipit-source-id: af0c79f976569c2059d39ecb49c6b8285161734f
Summary:
These warning's goal is to show the user where to be careful in their code. So make them point to the user's code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39143
Differential Revision: D21764201
Pulled By: albanD
fbshipit-source-id: f1369d1b0e71d93af892ad3b7b1b3030e6699c59
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37098
### **Cherry-picked from another stack:**
Some code review already occurred here: https://github.com/pytorch/pytorch/pull/32582
### Summary:
Fixes: https://github.com/pytorch/pytorch/issues/32436
The issue caused incorrect handling of dtypes for scalar ** tensor.
e.g. before this change:
```
>>> 5.5 ** torch.ones(5, dtype=torch.int32)
tensor([5, 5, 5, 5, 5], dtype=torch.int32)
```
should return a float tensor.
Also fixes a number of incorrect cases:
* tensors to negative powers were giving incorrect results (1 instead
of 0 or error)
* Behavior wasn't consistent between cuda/cpu
* large_value ** 1 in some cases gave a result not equal
to large_value because of truncation in conversion to double and back.
BC-breaking:
Previously incorrect behavior (in 1.4):
```
>>> a
tensor([1, 1, 1, 1, 1], dtype=torch.int32)
>>> a.pow_(.5)
tensor([1, 1, 1, 1, 1], dtype=torch.int32)
```
After this change:
`RuntimeError: result type Float can't be cast to the desired output type Int`
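A hedged sketch of how affected code can adapt after this change: do the fractional power out-of-place (type promotion produces a float tensor), or convert explicitly if an in-place update is needed.
```
import torch

a = torch.ones(5, dtype=torch.int32)

# a.pow_(0.5)            # now raises: result type Float can't be cast to Int
b = a.pow(0.5)           # out-of-place: type promotion yields a float tensor
c = a.float().pow_(0.5)  # or convert first if an in-place update is required
```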
Test Plan: Imported from OSS
Differential Revision: D21686207
Pulled By: nairbv
fbshipit-source-id: e797e7b195d224fa46404f668bb714e312ea78ac
Summary:
This enables type checking for named tensors, and fixes the underlying problems.
The bulk of the fix is modifying `gen_pyi.py` to generate reasonable types in `torch/__init__.pyi`. I took two approaches: First, I tried to take a generic approach and added `DimnameList` to the magic list of variable argument lists. Unfortunately that was insufficient for many of the method signatures, so I also added manual definitions for `rename`, `refine_names`, and `unflatten` in `__init__.pyi.in`.
Finally there were a few problems in the doctests that had to be cleaned up so that `test/test_type_hints.py` will run successfully.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36890
Differential Revision: D21259192
Pulled By: zou3519
fbshipit-source-id: 2a9e7d7bec9be5ae3ae2995078c6abfa3eca103c
Summary:
Resolves https://github.com/pytorch/pytorch/issues/36730 and https://github.com/pytorch/pytorch/issues/36057
Partially resolves: https://github.com/pytorch/pytorch/issues/36671
```
>>> 2j / torch.tensor([4], dtype = torch.complex64)
tensor([(0.0000+0.5000j)], dtype=torch.complex64)
>>> 1 / torch.tensor(3+4j)
tensor((0.1200-0.1600j), dtype=torch.complex64)
```
rdiv is more generally broken for all dtypes because it doesn't promote the types properly, e.g.:
```
>>> 1 / torch.tensor(2)
tensor(0)
>>> 2j / torch.tensor(4)
tensor(0)
```
so that issue should be fixed in a separate PR
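Until that is fixed, a hedged workaround is to make the tensor operand's dtype explicit so no integer truncation occurs:
```
import torch

print(1 / torch.tensor(2, dtype=torch.float32))     # 0.5 instead of 0
print(2j / torch.tensor(4, dtype=torch.complex64))  # 0.5j instead of 0
```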
Adding CPU acc types for complex
Added cumsum, cumprod for complex dtypes
Added complex dtypes to get_all_math_dtypes to expand testing for complex dtypes
Old PR - https://github.com/pytorch/pytorch/pull/36747
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37193
Differential Revision: D21229373
Pulled By: anjali411
fbshipit-source-id: 8a086136d8c10dabe62358d276331e3f22bb2342
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615
Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well (though using side-by-side view and ignoring
whitespace change might be helpful).
Test Plan: CI
Differential Revision: D20842886
Pulled By: dreiss
fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed
Summary:
Resolves https://github.com/pytorch/pytorch/issues/36730 and https://github.com/pytorch/pytorch/issues/36057
Partially resolves: https://github.com/pytorch/pytorch/issues/36671
```
>>> 2j / torch.tensor([4], dtype = torch.complex64)
tensor([(0.0000+0.5000j)], dtype=torch.complex64)
>>> 1 / torch.tensor(3+4j)
tensor((0.1200-0.1600j), dtype=torch.complex64)
```
rdiv is more generally broken for all dtypes because it doesn't promote the types properly, e.g.:
```
>>> 1 / torch.tensor(2)
tensor(0)
>>> 2j / torch.tensor(4)
tensor(0)
```
so that issue should be fixed in a separate PR
Adding CPU acc types for complex
Added cumsum, cumprod for complex dtypes
Added complex dtypes to get_all_math_dtypes to expand testing for complex dtypes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36747
Differential Revision: D21138687
Pulled By: anjali411
fbshipit-source-id: ad3602ccf86c70294a6e71e564cb0d46c393dfab
Summary:
The quantizer uses std::vector to store per-channel scales and zero_points, but querying the scales (or zero_points) requires returning a tensor. This means tensors have to be initialized from std::vector, which costs a lot of time. So I change the quantizer to store per-channel scales and zero_points as tensors directly.
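A hedged sketch of the accessors this change affects, using per-channel quantization of a small tensor:
```
import torch

x = torch.randn(4, 3)
scales = torch.tensor([0.1, 0.2, 0.3])
zero_points = torch.zeros(3, dtype=torch.long)

q = torch.quantize_per_channel(x, scales, zero_points, axis=1, dtype=torch.qint8)
print(q.q_per_channel_scales())       # a tensor of scales, no std::vector round trip
print(q.q_per_channel_zero_points())  # a tensor of zero points
```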
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31040
Differential Revision: D19701070
Pulled By: jerryzh168
fbshipit-source-id: 9043f16c44b74dd8289b8474e540171765a7f92a
Summary:
split requires an int input; however, during tracing, operators such as
size(axis) return a tensor, which is different behavior than when not
tracing. split therefore needs to be modified to handle these cases.
Fixes https://github.com/pytorch/pytorch/issues/27551
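A hedged sketch of the traced pattern this fixes (`chunk_halves` is a hypothetical function; under tracing the value from `size(axis)` is recorded rather than baked in as a constant):
```
import torch

def chunk_halves(x):
    half = x.size(1) // 2        # recorded during tracing, not a constant int
    return torch.split(x, half, dim=1)

traced = torch.jit.trace(chunk_halves, torch.randn(2, 8))
a, b = traced(torch.randn(2, 8))
print(a.shape, b.shape)          # torch.Size([2, 4]) torch.Size([2, 4])
```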
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32493
Reviewed By: hl475
Differential Revision: D19538254
Pulled By: houseroad
fbshipit-source-id: c8623009de5926aa38685e08121f4b48604bd8c0
Summary:
Adds `torch.floor_divide` following the numpy's `floor_divide` api. I only implemented the out-of-place version, I can add the inplace version if requested.
Also fixes https://github.com/pytorch/pytorch/issues/27512
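A hedged usage sketch of the new out-of-place op:
```
import torch

print(torch.floor_divide(torch.tensor([5, 7, 9]), 2))                      # tensor([2, 3, 4])
print(torch.floor_divide(torch.tensor([5., 7.]), torch.tensor([2., 4.])))  # tensor([2., 1.])
```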
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30493
Differential Revision: D18896211
Pulled By: eellison
fbshipit-source-id: ee401c96ab23a62fc114ed3bb9791b8ec150ecbd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30892
Fixes all outstanding lints and actually installs a properly configured
flake8
Test Plan: Imported from OSS
Differential Revision: D18862825
Pulled By: suo
fbshipit-source-id: 08e9083338a7309272e17bb803feaa42e348aa85
Summary:
When converting a contiguous CuPy ndarray to Tensor via `__cuda_array_interface__`, an error occurs due to incorrect handling of default strides. This PR fixes this problem. It makes `torch.tensor(cupy_ndarray)` work for contiguous inputs.
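A hedged sketch of the interop path this fixes (requires CuPy and a CUDA device):
```
import cupy as cp
import torch

a = cp.arange(12, dtype=cp.float32).reshape(3, 4)  # row-major contiguous CuPy ndarray
t = torch.tensor(a)                                # copies via __cuda_array_interface__
print(t.shape, t.dtype)                            # torch.Size([3, 4]) torch.float32
```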
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24947
Differential Revision: D18838986
Pulled By: ezyang
fbshipit-source-id: 2d827578f54ea22836037fe9ea8735b99f2efb42
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27308
Currently, `tensor.align_to(*names)` has the restriction that `tensor`
must be fully named. This doesn't need to be the case: when using
Ellipsis, we "expand the ellipsis to all unmentioned dimensions, in the
order in which they appear in the original tensor".
For example, consider `tensor: Tensor[None, None, C]`.
`tensor.align_to(C, None, None)` is ambiguous because the user might
have wanted to switch the order of the None dimensions and there is no
way to specify that using this API. However, `tensor.align_to('C', ...)`
isn't ambiguous: we can select the two unnamed dimensions in the order
in which they appear.
To actually implement this, we write a brand-new `align_to(names,
ellipsis_idx)` function in c++ that is separate from the regular
`align_to(names)` implementation. Ideally we would support "..." as a
special name in c++ and combine the two implementations; we'll need to
support "..." in c++ in the future but that requires a bit of extra work.
In this PR, Python processes the ellipsis and then calls the correct
overload.
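A hedged sketch of the new ellipsis behavior on a partially named tensor:
```
import torch

t = torch.randn(2, 4, 3, names=(None, None, 'C'))
out = t.align_to('C', ...)    # 'C' first; unnamed dims follow in their original order
print(out.names, out.shape)   # ('C', None, None) torch.Size([3, 2, 4])
```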
Test Plan: - run tests
Differential Revision: D17745179
Pulled By: zou3519
fbshipit-source-id: 9fed06d224215cfb7efecd8c002604baab3c45e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27173
`docs/source/named_tensor.rst` is the entry point; most users will land
either here or the named tensor tutorial when looking to use named
tensors. We should strive to make this as readable, concise, and understandable
as possible.
`docs/source/name_inference.rst` lists all of the name inference rules.
It should be clear but it's hard to make it concise.
Please let me know if anything doesn't make sense and please propose
alternative wordings and/or restructuring to improve the documentation.
This should ultimately get cherry-picked into the 1.3 branch as one
monolithic commit so it would be good to get all necessary changes made
in this PR and not have any follow ups.
Test Plan: - built and reviewed locally with `cd docs/ && make html`.
Differential Revision: D17763046
Pulled By: zou3519
fbshipit-source-id: c7872184fc4b189d405b18dad77cad6899ae1522
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27339
This PR just shows a warning message.
Eventually we will show a correct __dir__
Test Plan: Imported from OSS
Differential Revision: D17751333
Pulled By: zafartahirov
fbshipit-source-id: e9bc62fd8dd0147979291d0aac3f1afe5b8c7a9f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26675
Based on an offline poll, we're very unlikely to have multi-axis quantized tensors in the foreseeable future. Let's simplify the API and just return int instead of list. It also matches the singular `axis` name.
Test Plan: Imported from OSS
Differential Revision: D17537052
Pulled By: dzhulgakov
fbshipit-source-id: 676abc3b251d288468aaed467b5e5ca4063b98b0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26339
Serializes per-channel tensors in both torch.serialization and jit. Since we haven't bound Quantizer properly yet, I chose to save a tuple representing the quantizer settings. To avoid recursive tensor serialization calls, I'm using a tuple instead of a tensor to store scales and zero points.
driazati - please check the serialization logic. Is there a good test that compares that JIT serialization and python serialization are equivalent? (I haven't tested it yet)
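A hedged round-trip sketch of the serialization this enables:
```
import io
import torch

q = torch.quantize_per_channel(torch.randn(4, 3),
                               torch.tensor([0.1, 0.2, 0.3]),
                               torch.zeros(3, dtype=torch.long),
                               axis=1, dtype=torch.qint8)

buf = io.BytesIO()
torch.save(q, buf)            # per-channel quantized tensor now serializes
buf.seek(0)
q2 = torch.load(buf)
print(torch.equal(q.dequantize(), q2.dequantize()))  # True
```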
Test Plan: Imported from OSS
Differential Revision: D17443222
Pulled By: dzhulgakov
fbshipit-source-id: a34758de1ffd2ec1cdc5355f5baf95284a4ccf4b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26648
Previously:
- `Tensor.align_to(*names)` only works on fully named tensors. In addition, the
desired ordering `names` must not have any None-names.
- `Tensor.align_to(*names)` accepted `...`, but expanded it based on
position, i.e., in `tensor.align_to('N', ..., 'C', 'H')`, `...` expands
to `*tensor.names[1:-2]`. This is wildly incorrect: see the following
concrete example.
```
tensor = tensor.refine_names('N', 'C', 'H', 'W')
tensor.align_to('W', ...) # ... expands to 'C', 'H', 'W'
```
This PR changes it so that `...` in `tensor.align_to` grabs all
unmentioned dimensions from `tensor`, in the order that they appear.
`align_to` is the only function that takes ellipsis that requires this
change. This is because all other functions (e.g. `refine_names`) require their
list of names to work in a positional manner, but `align_to` lets the
user reorder dimensions.
This does not add very much overhead to `align_to`, as shown in the
following benchmark. However, in the future, we should resolve to make
these operations faster; align_to should be as fast as view but isn't
most likely due to Python overhead.
```
[ins] In [2]: import torch
...: named = torch.randn(3, 3, 3, 3, names=('N', 'C', 'H', 'W'))
...: unnamed = torch.randn(3, 3, 3, 3)
...: %timeit unnamed[:]
...: %timeit unnamed.view(-1)
...: %timeit named.align_to(...)
...: %timeit named.align_to('N', 'C', 'H', 'W')
31 µs ± 126 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
43.8 µs ± 146 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
69.6 µs ± 142 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
66.1 µs ± 1.13 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
Test Plan:
- new tests [namedtensor ci]
Differential Revision: D17528207
Pulled By: zou3519
fbshipit-source-id: 4efc70329f84058c245202d0b267d0bc5ce42069
Summary:
Changelog:
- Remove `torch.gels` which was deprecated in v1.2.0
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26480
Test Plan: - No tests were changed and all callsites for `torch.gels` were modified to `torch.lstsq` when `torch.lstsq` was introduced
Differential Revision: D17527207
Pulled By: zou3519
fbshipit-source-id: 28e2fa3a3bf30eb6b9029bb5aab198c4d570a950
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26548
This makes the naming more consistent with PyTorch's API. The original
concern was that `tensor.rename` might make the operation seem like it
is in-place. However, we have many "verb" APIs: `tensor.add(other)`, for
example, doesn't add other to tensor in-place, but `tensor.add_(other)`
does.
`tensor.rename_` does exactly the same thing as `tensor.rename`, but
in-place.
Test Plan: - [namedtensor ci]
Differential Revision: D17502021
Pulled By: zou3519
fbshipit-source-id: 6a5b93136a820075013cd1e30fb8fc6b9d77d7d9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26350
Python 3 lets us use `...` to perform indexing. Semantically, `...`
means "the rest of the unspecified dimensions". For example, while
indexing, one can do (for 5D `tensor`) `tensor[0, 0, ..., 0]` and
the `...` is expanded into `tensor[0, 0, :, :, 0]`.
Previously, we were using '*' to represent a similar behavior in names.
For example, `tensor.refine_names` supports things like the following:
```
x = torch.randn(2, 3, 4, 5, 6)
x_out = x.refine_names('*', 'H', 'W') # refine only the last two dimensions
```
This PR changes it so that named tensor API functions recognize `'...'`
(in Python 2 and Python 3) and `...` (in Python 3 exclusively) instead
of `'*'`.
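A hedged rewrite of the example above using the new tokens (both forms are recognized per this PR):
```
import torch

x = torch.randn(2, 3, 4, 5, 6)
x_out = x.refine_names(..., 'H', 'W')    # Python 3 Ellipsis object
x_out = x.refine_names('...', 'H', 'W')  # string form, also recognized
print(x_out.names)                       # (None, None, None, 'H', 'W')
```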
Test Plan: - [namedtensor ci]
Differential Revision: D17424666
Pulled By: zou3519
fbshipit-source-id: 003182879fd38ced3fea051217572a457cdaf7cf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26349
The directory holds a lot of private helper functions that help
implement named tensor functionality. Instead of naming each helper
function with a leading underscore, I change the name of the import to
`_namedtensor_internals` to signal it should not be used directly.
Test Plan: - [namedtensor ci]
Differential Revision: D17424178
Pulled By: zou3519
fbshipit-source-id: 8f7b74346765759303480e581038a661021acf53
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25658
This unflattens `dim` according to the shape specified in `namedshape`.
`namedshape` may be either an OrderedDict or an iterable of (name, size)
tuples.
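A hedged sketch using the (name, size) tuple form:
```
import torch

x = torch.randn(2, 12, names=('N', 'D'))
y = x.unflatten('D', (('C', 3), ('H', 2), ('W', 2)))
print(y.names, y.shape)   # ('N', 'C', 'H', 'W') torch.Size([2, 3, 2, 2])
```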
Future:
- It is possible to make it take a dict in Python >= 3.6 because those are
ordered by default, but I'll leave that task for the future.
Test Plan: - new tests [namedtensor ci]
Differential Revision: D17192655
Pulled By: zou3519
fbshipit-source-id: fd9bd2f462c23a4df1c23d66f2aa95076ff1b160
Summary:
Because of `return NotImplemented`, `__contains__` returned True when the element is not a number, since
`bool(NotImplemented) == True`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24156
Differential Revision: D16829895
Pulled By: zou3519
fbshipit-source-id: 9d3d58025b2b78b33a26fdfcfa6029d0d049f11f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25843
`tensor.align_to(*names)` permutes the dimensions of `tensor` and adds
additional 1-sized dimensions such that the output tensor has dimensions
in the same order as `names`. All dimensions of `tensor` must be
present in `names`; in addition, this function requires that all dims of
`tensor` be named.
`tensor.align_as(other)` is equivalent to
`tensor.align_to(*other.names)`.
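A hedged sketch of both calls:
```
import torch

x = torch.randn(2, 3, names=('N', 'C'))
y = x.align_to('W', 'H', 'N', 'C')   # permutes and inserts 1-sized 'W' and 'H' dims
print(y.names, y.shape)              # ('W', 'H', 'N', 'C') torch.Size([1, 1, 2, 3])
z = x.align_as(y)                    # same as x.align_to(*y.names)
```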
I'm planning on changing `torch.align_tensors(*tensors)` to align closer
to these semantics because there didn't seem to be a clear use case for the old
semantics that preserve unnamed dimensions. That will come in a future
change.
Test Plan: - new tests [namedtensor ci]
Differential Revision: D17255549
Pulled By: zou3519
fbshipit-source-id: 1e437ad81e9359b4d5bd0e7e64c3a1be441fc3e3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25842
`tensor.refine_names(*names)` takes `tensor` and attempts to name its
dimensions `names` out-of-place. If a dimension `i` already had a name,
then it cannot be changed (so tensor.names[i] must equal names[i]);
if the original dimension did not have a name, then the new name
(names[i]) can be anything.
`tensor.refine_names(*names)` also accepts a glob '*' that greedily selects
names from `tensor`. Here are some examples:
- `Tensor[None].refine_names('N') -> Tensor[N]`
- `Tensor[N].refine_names('N') -> Tensor[N]`
- `Tensor[N].refine_names('D') -> Error!`
- `Tensor[N].refine_names(None) -> Error!`
- `Tensor[None, None].refine_names('*', D) -> Tensor[None, D]`
Test Plan: - new tests [namedtensor ci]
Differential Revision: D17255548
Pulled By: zou3519
fbshipit-source-id: fdbdb3a12f24fbe37ce1e53ed09dc8a42589d928
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25711
This function renames the dimensions of a tensor out-of-place. Because
of that, I think `tensor.renamed(...)` is a clearer name: `view_names`
has the connotation that we can use names to `view` our tensors with a
"different shape", but what this function really does is let us rename a
tensor no matter the previous names.
`tensor.names_`, the in-place version of this, is unchanged for now.
However, we might delete this or not advertise it if it has no use case
and also because its naming is a little inconsistent with `tensor.renamed`.
Test Plan: - [namedtensor ci]
Differential Revision: D17206515
Pulled By: zou3519
fbshipit-source-id: 67053951fcc8130c84566b5ebbdce35ef619c90d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25345
Test Plan
- New tests [namedtensor ci]
Test Plan: Imported from OSS
Differential Revision: D17101486
Pulled By: zou3519
fbshipit-source-id: 58e803b042056ee6abab8551517f74078f2b81d5
Summary:
The semantics of the _auto-convert GPU arrays that support the __cuda_array_interface__ protocol_ feature have changed a bit.
It used to throw an exception when using `torch.as_tensor(...,device=D)` where `D` is a CUDA device other than the one specified in `__cuda_array_interface__`. Now, this is supported and results in an implicit copy.
I do not know what has changed, but `from_blob()` now supports input and output devices that differ.
I have updated the tests to reflect this, which fixes https://github.com/pytorch/pytorch/issues/24968
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25017
Differential Revision: D16986240
Pulled By: soumith
fbshipit-source-id: e6f7e2472365f924ca155ce006c8a9213f0743a7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23973
Without loss of generality, I describe the API for `tensor.view_names`.
`tensor.names_` has an analogous API.
`tensor.view_names(*names)` returns a view on tensor with named dims `names`.
`names` must be of length `tensor.dim()`; alternatively, if '*' is in `names`,
then it (known as the "glob") is expanded greedily to be equal to the
corresponding names from `tensor.names`.
For example,
```
>>> x = torch.empty(2, 3, 5, 7, names=('N', 'C', 'H', 'W'))
>>> x.view_names('*', 'height', 'width').names
('N', 'C', 'height', 'width')
>>> x.view_names('batch', '*', 'width').names
('batch', 'C', 'H', 'width')
```
tensor.view_names(**rename_map) returns a view on tensor that has
renamed dims as specified in the mapping `rename_map`.
For example,
```
>>> x = torch.empty(2, 3, 5, 7, names=('N', 'C', 'H', 'W'))
>>> x.view_names(W='width', H='height').names
('N', 'C', 'height', 'width')
```
These are different(!!!) from the C++ API, which only allows the
following:
- tensor.view_names(optional<DimnameList>)
C++ API parity for named tensors is not important right now; I am
punting that to the future.
Test Plan: - [namedtensor ci]
Differential Revision: D16710916
Pulled By: zou3519
fbshipit-source-id: 7cb8056c0fb4c97b04c3a2d1dd0f737e0a67ce34
Summary:
Changelog:
- Rename `gels` to `lstsq`
- Fix all callsites
- Rename all tests
- Create a tentative alias for `lstsq` under the name `gels` and add a deprecation warning to not promote usage.
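A hedged sketch against the API as of this commit (`torch.lstsq` was itself deprecated in later releases):
```
import torch

A = torch.randn(5, 3)
B = torch.randn(5, 2)

X, qr = torch.lstsq(B, A)    # new name
# torch.gels(B, A)           # tentative alias; emits a deprecation warning
```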
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23460
Test Plan: - All tests should pass to confirm that the patch is correct
Differential Revision: D16547834
Pulled By: colesbury
fbshipit-source-id: b3bdb8f4c5d14c7716c3d9528e40324cc544e496
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21709
Change the return type from Scalar to double/int64_t so we don't need to do a conversion when we call other quantize-related aten functions.
Differential Revision: D15793003
fbshipit-source-id: 510936c69fa17a4d67340a31ebb03415647feb04
Summary:
Added some extra tests for std_mean and var_mean for multiple dims.
Some refactoring of previously created tests based on PR comments: https://github.com/pytorch/pytorch/pull/18731
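A hedged sketch of the multi-dim reductions these tests cover:
```
import torch

x = torch.randn(4, 5, 6)
std, mean = torch.std_mean(x, dim=(0, 2))   # reduce over multiple dims at once
var, mean = torch.var_mean(x, dim=(0, 2))
print(std.shape, mean.shape)                # torch.Size([5]) torch.Size([5])
```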
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20650
Differential Revision: D15396101
Pulled By: ifedan
fbshipit-source-id: d15c3c2c7084a24d6cfea4018173552fcc9c03a9