Summary:
Adds the C++ API `clip_grad_value_` for the `torch::nn::utils` module.
Also fixes a `for` loop indentation error in the original test/test_nn.py.
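For context, a minimal usage sketch, assuming the Python-parity signature (a vector of parameters plus a clip value):
```cpp
#include <torch/torch.h>

int main() {
  torch::nn::Linear linear(4, 2);
  auto loss = linear->forward(torch::randn({3, 4})).sum();
  loss.backward();
  // Clamp every parameter's gradient to [-0.5, 0.5] in place.
  torch::nn::utils::clip_grad_value_(linear->parameters(), 0.5);
}
```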
Issue: https://github.com/pytorch/pytorch/issues/25883
Reviewer: yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28736
Differential Revision: D18263807
Pulled By: yf225
fbshipit-source-id: 29282450bd2099df16925e1d0edd3d933f6eeb9b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28523
New features:
1. Previously, `torch::tensor({true, false, true})` threw `"tensor_cpu" not implemented for 'Bool'`. After this PR, it produces the correct bool tensor, matching the Python API behavior.
2. Tensors with zero-size dimensions are now supported, e.g. `torch::tensor({{}, {}})` produces a tensor with sizes `{2, 0}`, matching the Python API behavior.
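A minimal sketch of both new behaviors (assertions for illustration only):
```cpp
#include <torch/torch.h>
#include <cassert>

int main() {
  // (1) bool values now produce a kBool tensor instead of throwing.
  auto b = torch::tensor({true, false, true});
  assert(b.dtype() == torch::kBool);

  // (2) zero-size dimensions are now supported: sizes {2, 0}.
  auto e = torch::tensor({{}, {}});
  assert(e.size(0) == 2 && e.size(1) == 0);
}
```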
BC-breaking bug fixes:
1. Previously, `torch::tensor({{1}, {2}})` produced a tensor of sizes `{2}`. After this PR, it produces a tensor of sizes `{2, 1}`, matching the Python API behavior.
2. Fixed semantics of `torch::tensor(1.1)`: it now returns a 0-dim tensor instead of a 1-dim tensor, matching the Python API behavior.
3. Previously, passing a non-dtype `TensorOptions` to the `torch::tensor` constructor always produced a tensor of dtype `float`. After this PR, the dtype is inferred from the braced-init-list, matching the behavior of the no-options case.
```cpp
// Previously:
torch::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
torch::tensor({{1, 2, 3}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
torch::tensor({1., 2., 3.}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
torch::tensor({{1., 2., 3.}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
// Now:
torch::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> int
torch::tensor({{1, 2, 3}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> int
torch::tensor({1., 2., 3.}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> double
torch::tensor({{1., 2., 3.}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> double
// For comparison, the existing no-options behavior:
torch::tensor({1, 2, 3}).dtype() -> int
torch::tensor({{1, 2, 3}}).dtype() -> int
torch::tensor({1., 2., 3.}).dtype() -> double
torch::tensor({{1., 2., 3.}}).dtype() -> double
```
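The first two BC-breaking fixes, sketched the same way (assertions for illustration only):
```cpp
#include <torch/torch.h>
#include <cassert>

int main() {
  // (1) nested braces now keep the inner dimension: sizes {2, 1}, not {2}.
  auto t = torch::tensor({{1}, {2}});
  assert(t.dim() == 2 && t.size(0) == 2 && t.size(1) == 1);

  // (2) a single scalar now yields a 0-dim tensor, matching Python.
  auto s = torch::tensor(1.1);
  assert(s.dim() == 0);
}
```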
Notes:
1. From now on, the behavior of `at::tensor(scalar_value)` (which produces a 1-dim tensor) differs from `torch::tensor(scalar_value)` (which produces a 0-dim tensor). I will fix the behavior of `at::tensor(scalar_value)` in a follow-up PR.
2. From now on, the behavior of `at::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/))` (which produces a `float` tensor) differs from `torch::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/))` (which produces an `int` tensor). I will fix this behavior of the `at::tensor` constructor in a follow-up PR.
Context for the changes in this PR:
The motivation comes from fixing the "`torch::tensor({{1}, {2}})` gives a tensor of the wrong sizes" bug - in order to fix it, I had to move the handling of `at::ArrayRef` and `std::vector` into `InitListTensor` (see below for why this is needed) and rename `InitListTensor` to `TensorDataContainer`. After these changes, support for bool values comes out of the box without extra effort, and support for tensors with zero-size dimensions only requires adding a default constructor to `TensorDataContainer`, so I added those two in this PR.
For the semantic change of `torch::tensor(1.1)`, it's actually more effort to preserve the original wrong behavior (i.e. we would need to check the sizes of the tensor converted from `TensorDataContainer` and reshape any scalar tensor to a 1-dim tensor). Preserving the original wrong behavior doesn't give us much value, and since the above changes naturally fix the problem, we should just start using the right behavior instead.
For the "constructor with non-dtype options behavior" fix, the code looks simpler and easier to reason about with the fix, so I included it in this PR.
--------
Why we need to move the handling of `at::ArrayRef` and `std::vector` into `TensorDataContainer`:
`torch::tensor({{1}, {2}})` can match the function overload
`torch::tensor(at::ArrayRef<int> values)`, because `{1}` and `{2}` can each be treated as
list-initialization of an `int` value. However, this produces a Tensor with sizes `{2}`,
when we actually want a Tensor with sizes `{2, 1}`. In order to avoid matching this function overload,
we removed it and moved the ability to convert from `at::ArrayRef<T>`
(and similarly `std::vector<T>`) into `TensorDataContainer`. Since, for a braced-init-list, the
`TensorDataContainer(std::initializer_list<TensorDataContainer>)` constructor is always preferred over all other constructors, it takes the `std::initializer_list` path, and all is good.
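As a standalone illustration of the C++ rule this relies on, using a hypothetical `Container` type (not PyTorch code): a braced-init-list always selects an `std::initializer_list` constructor when one is viable.
```cpp
#include <initializer_list>
#include <iostream>
#include <vector>

struct Container {
  Container(int) { std::cout << "scalar\n"; }
  Container(std::vector<int>) { std::cout << "vector\n"; }
  Container(std::initializer_list<Container>) { std::cout << "init-list\n"; }
};

int main() {
  // Braced list: the initializer_list ctor wins over the vector ctor, so each
  // element is first converted to a Container ("scalar" x3), then "init-list".
  Container a{1, 2, 3};
  // A named vector argument still reaches the vector ctor ("vector").
  Container b(std::vector<int>{1, 2, 3});
}
```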
Test Plan: Imported from OSS
Differential Revision: D18234625
Pulled By: yf225
fbshipit-source-id: 0f3f6912e82e2117d2103e31b74e7e97baaa8693
Summary:
Adds `torch::nn::functional::fold` support and updates `Fold::pretty_print` in the C++ API for more thorough Python parity.
Note: Small updates in source files to maintain consistency elsewhere.
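For reference, a hedged usage sketch (the `FoldFuncOptions` spelling follows the C++ functional-options convention and is an assumption here):
```cpp
#include <torch/torch.h>

int main() {
  namespace F = torch::nn::functional;
  // Fold 12 sliding (2, 2) blocks of a (1, 4, 12) input into a (1, 1, 4, 5) map.
  auto input = torch::randn({1, 4, 12});
  auto output = F::fold(input, F::FoldFuncOptions({4, 5}, {2, 2}));
}
```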
Reviewer: yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28732
Differential Revision: D18219955
Pulled By: yf225
fbshipit-source-id: fd2e9be8f17db77c1b1f384c0d2e16cc34858c0c
Summary:
Before, we would only report the key we were looking for (typically
just "No such serialized tensor 'weight'"), no matter which submodule
we were looking for a weight in.
Now we error with "No such serialized tensor '0.conv1.weight'" or
similar.
The analogous information is added to missing-module error messages.
I threw in a test, and it saved me already...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28499
Differential Revision: D18122442
Pulled By: yf225
fbshipit-source-id: a134b6d06ca33de984a11d6fea923244bcd9fb95
Summary:
Add torch::nn::BatchNorm1d function/module support for the C++ API.
torch::nn::BatchNorm{2,3}d will be added after this PR is merged.
Related Issue: https://github.com/pytorch/pytorch/issues/25883
Reviewer: yf225
I would like to discuss the items below.
* Necessity of `num_batches_tracked` in `BatchNormImplBase`
  * `num_batches_tracked` is needed to calculate `momentum` when the `momentum` argument is not supplied in the Python API. But in the C++ API, the `momentum` argument has a default value.
  * `num_batches_tracked` is only used to count `BatchNorm1d::forward()` calls. I think it is no longer necessary for users.
* The design of `BatchNorm{1,2,3}dOptions`
  * We already have `BatchNormOptions`, used by the deprecated `BatchNorm` module. However, it is hard to reuse it for `BatchNorm{1,2,3}dOptions` because the arguments of the modules disagree.
  * In this PR, I introduce a `BatchNormOptionsv2` template class for the `BatchNorm{1,2,3}dOptions`, but I'm not sure whether this design is good.
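For orientation, a usage sketch assuming the eventual `BatchNorm1dOptions` spelling (the PR itself debates a `BatchNormOptionsv2` template, so the option name is an assumption):
```cpp
#include <torch/torch.h>

int main() {
  torch::nn::BatchNorm1d bn(
      torch::nn::BatchNorm1dOptions(4).momentum(0.1).affine(true));
  auto x = torch::randn({8, 4});  // (batch, num_features)
  auto y = bn->forward(x);        // normalized per feature, same shape
}
```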
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28176
Differential Revision: D18196843
Pulled By: yf225
fbshipit-source-id: 667e2b5de4150d5776c41b9088c9e6c2ead24cd4
Summary:
I finally found a way to get the following API to work for constructing a list of named submodules for `Sequential`:
```cpp
Sequential sequential({
{"m1", MyModule(1)},
{"m2", MyModule(2)}
});
```
which was actually our original proposed design and much simpler than our current API:
```cpp
Sequential sequential(modules_ordered_dict({
{"m1", MyModule(1)},
{"m2", MyModule(2)}
}));
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28774
Differential Revision: D18174013
Pulled By: yf225
fbshipit-source-id: 3a18c2d36b6a65a07bee7346a7516780567c7774
Summary:
Changelog:
- Guard the inclusion of certain torch/csrc/distributed files in caffe2/CMakeLists.txt when USE_DISTRIBUTED=0
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28621
Test Plan:
- Builds should be successful
- Tests should pass
Differential Revision: D18145330
Pulled By: ezyang
fbshipit-source-id: 7167a356b03ae783e6b0120f2ad3552db2b3ed86
Summary:
This PR is BC-breaking in the following way:
Previously, we required a `std::string` to specify the mode for `EmbeddingBag`. After this PR, we use variant-based enums such as `torch::kSum` / `torch::kMean` / `torch::kMax` to specify the mode for `EmbeddingBag`.
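A hedged usage sketch of the enum-based mode (option and method names assumed from the current C++ API):
```cpp
#include <torch/torch.h>

int main() {
  // After this PR: a variant-based enum instead of the string "sum".
  torch::nn::EmbeddingBag bag(
      torch::nn::EmbeddingBagOptions(10, 3).mode(torch::kSum));
  auto input = torch::tensor({1, 2, 4, 5, 4, 3, 2, 9}, torch::kLong);
  auto offsets = torch::tensor({0, 4}, torch::kLong);
  auto out = bag->forward(input, offsets);  // (2, 3): one summed bag per offset
}
```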
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28330
Differential Revision: D18127116
Pulled By: yf225
fbshipit-source-id: 15cd86c764777f4d399587be92cda15b6ce8524b
Summary:
https://github.com/pytorch/pytorch/issues/25883
I put `grid_sample` in vision.h alongside `affine_grid`.
I have a question about the string arguments (interpolation mode, padding mode):
I reuse `torch::native::detail::GridSamplerInterpolation` from GridSampler.h instead of using strings,
following the way the reduction enum is used in the loss functions.
I am not sure this is right.
yf225
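A hedged usage sketch, assuming enum-based options analogous to the loss reductions (the exact option spellings are an assumption):
```cpp
#include <torch/torch.h>

int main() {
  namespace F = torch::nn::functional;
  auto input = torch::randn({1, 1, 4, 4});
  auto grid = torch::rand({1, 8, 8, 2}) * 2 - 1;  // normalized coords in [-1, 1]
  auto output = F::grid_sample(
      input, grid,
      F::GridSampleFuncOptions().mode(torch::kBilinear).padding_mode(torch::kZeros));
}
```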
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28354
Differential Revision: D18109333
Pulled By: yf225
fbshipit-source-id: 1bf972b671b107464f73b937bbe0de76fb259fbf
Summary:
Sequential does not accept modules whose forward takes Tensor&
(const Tensor& and Tensor are both OK).
Functional and others use Tensor when they want to potentially
change things in-place.
This changes ReLU and friends to do the same.
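A sketch of the pattern with a hypothetical module (illustration only, not the PR's code): taking `Tensor` by value keeps the module usable in `Sequential` while still permitting in-place work on the shared storage.
```cpp
#include <torch/torch.h>

struct MyReLUImpl : torch::nn::Module {
  // `Tensor input` (by value) rather than `Tensor& input`: Sequential can call
  // it, and relu_ can still mutate the tensor's storage in place.
  torch::Tensor forward(torch::Tensor input) {
    return torch::relu_(input);
  }
};
TORCH_MODULE(MyReLU);
```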
Unfortunately, this seems to be BC-breaking at the ABI level.
On the other hand, use of the ReLU module seems rare enough outside
Sequential (in particular, in C++ models the standard seems to be
to use torch::relu instead).
Is the BC breakage OK here? (yf225 or anyone else)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28501
Differential Revision: D18089978
Pulled By: yf225
fbshipit-source-id: ac9aba6dc2081117dece57cd8a15bafe14ec8f51
Summary:
This PR adds `MSELoss`, `KLDivLoss` and `BCELoss`. The tests for `BCELoss` fail with the following error:
```
unknown file: Failure
C++ exception with description "autograd_meta() INTERNAL ASSERT FAILED at /home/shahriar/Contrib/pytorch/c10/core/TensorImpl.h:533, please report a bug to PyTorch. set_requires_grad is not implemented for Tensor (set_requires_grad at /home/shahriar/Contrib/pytorch/c10/core/TensorImpl.h:533)
```
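For context, a hedged usage sketch of one of the new modules:
```cpp
#include <torch/torch.h>

int main() {
  torch::nn::MSELoss criterion;
  auto input = torch::randn({3, 4}, torch::requires_grad());
  auto target = torch::randn({3, 4});
  auto loss = criterion(input, target);  // mean-reduced scalar by default
  loss.backward();
}
```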
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27156
Differential Revision: D17960323
Pulled By: yf225
fbshipit-source-id: 84b8431064f2f573679c03a8d7994e3e2f81a4d1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28129
The previous PR in the stack removed the need to order classes/functions
or have correct import statements. This resolved circular dependency issues
that can arise when class constructors like ModuleList put new instances
of themselves in a common namespace.
This PR changes our export format to no longer produce this information.
By doing so we can make the logic significantly simpler, since we just
keep track of an individual PythonPrint object per file.
Notes:
* PythonPrint was changed to manage its own stream/list of ranges. It
was doing this anyway internally; this just makes the API clearer.
* Since we are changing the serialization format, I also removed op_version_set.
It is now replaced with the VERSION number that is written in the zip archive.
This further simplifies the code emission process.
* A test of op_version_set was removed since there is no longer any behavior
to test.
Test Plan: Imported from OSS
Differential Revision: D17961610
Pulled By: zdevito
fbshipit-source-id: ada362c4ca34d05393a1a7e799c94785ab9d9825
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27947
Don't throw an exception if the requested size is the same as the one
currently in use
Test Plan:
ATEN_THREADING=NATIVE python setup.py develop --cmake
Imported from OSS
Differential Revision: D17919416
fbshipit-source-id: 411f7c9bd6a46e7a003b43a200c2ce3b76453a2e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26767
Now that we have tagged ivalues, we can accurately recover the type with
`ivalue.type()`. This removes the other half-implemented pathways that
were created because we didn't have tags.
Test Plan: Imported from OSS
Differential Revision: D17561191
Pulled By: zdevito
fbshipit-source-id: 26aaa134099e75659a230d8a5a34a86dc39a3c5c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26572
Combined with isinstance specialization, this allows a degree of polymorphic
functions to work without needing to use our weirder overload hacks.
We do not define any operators on Any, so the only thing you can do with it
is to put it in containers or type refine it using an isinstance check.
Any is restricted from appearing in non-argument position because we
cannot restore type tags if it ends up as a field in a class.
Test Plan: Imported from OSS
Differential Revision: D17530643
Pulled By: zdevito
fbshipit-source-id: f06f78ce84819f7773953a492f3d4c49219ee94c
Summary:
Adds support for the Bilinear layer to the C++ frontend
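A hedged usage sketch (the options spelling is assumed from the C++ options convention):
```cpp
#include <torch/torch.h>

int main() {
  torch::nn::Bilinear bilinear(torch::nn::BilinearOptions(5, 6, 7));
  auto x1 = torch::randn({8, 5});
  auto x2 = torch::randn({8, 6});
  auto y = bilinear->forward(x1, x2);  // shape (8, 7)
}
```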
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26082
Differential Revision: D17954148
Pulled By: yf225
fbshipit-source-id: 5e746bdea29b00e25969cd7a22044b8059b53687
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28039
Right now, torch::save() uses std::ostream, which results in unnecessary
data copies in practice. Similar for torch::load().
Adding a std::function<size_t(const void*, size_t)> as an output option,
parallel to the existing filename and std::ostream APIs, gives users the
flexibility to emit directly to a backing store.
For a simple case of appending the output to a std::string, we observe
significant benchmark savings (on the order of -50%), even with the
minor std::function<> dispatch overhead. The main reason is that
std::ostringstream effectively requires 2 extra copies of the data
beyond a simple string.append lambda.
We also provide a parallel API for load(), though this one is
slightly more complex due to the need to do arbitrary position reads.
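A hedged sketch of the writer-callback path (assuming the callback overload is forwarded through `torch::save`):
```cpp
#include <torch/torch.h>
#include <string>

int main() {
  auto t = torch::ones({2, 2});
  std::string buffer;
  // The writer receives raw bytes and reports how many it consumed.
  torch::save(t, [&buffer](const void* data, size_t size) -> size_t {
    buffer.append(static_cast<const char*>(data), size);
    return size;
  });
}
```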
Test Plan:
buck test mode/dev-nosan caffe2/test/...
(Basic serialization test in caffe2/test/cpp/api/serialize.cpp)
Benchmark in experimental/jeremyl/c2/SerializationBench.cpp, with D17823443
(1M time goes from 90ms -> 40ms, albeit with crc patch applied)
Differential Revision: D17939034
fbshipit-source-id: 344cce46f74b6438cb638a8cfbeccf4e1aa882d7
Summary:
Right now, torch::save() uses std::ostream, which results in unnecessary
data copies in practice. Similar for torch::load().
Adding a std::function<size_t(const void*, size_t)> as an output option,
parallel to the existing filename and std::ostream APIs, gives users the
flexibility to emit directly to a backing store.
For a simple case of appending the output to a std::string, we observe
significant benchmark savings (on the order of -50%), even with the
minor std::function<> dispatch overhead. The main reason is that
std::ostringstream effectively requires 2 extra copies of the data
beyond a simple string.append lambda.
We also provide a parallel API for load(), though this one is
slightly more complex due to the need to do arbitrary position reads.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27586
Test Plan:
buck test mode/dev-nosan caffe2/test/...
(Basic serialization test in caffe2/test/cpp/api/serialize.cpp)
Benchmark in experimental/jeremyl/c2/SerializationBench.cpp, with D17823443
(1M time goes from 90ms -> 40ms, albeit with crc patch applied)
Differential Revision: D17822962
Pulled By: jjlilley
fbshipit-source-id: d344a7e59707f3b30d42280fbab78f87399e4d10
Summary:
`at::ArrayRef` / `torch::IntArrayRef` should be discouraged in user code, because users might not be aware that it doesn't own the underlying data, which has already led to memory-access bugs when they write the following:
```cpp
auto expected_sizes = torch::IntArrayRef({2, 16, 6}); // The memory that represents `{2, 16, 6}` is released after this line
ASSERT_EQ(output.sizes(), expected_sizes); // `expected_sizes` is pointing to invalid memory region
```
This PR changes all usage of `at::ArrayRef` and `torch::IntArrayRef` to the corresponding `std::vector` version, so that users won't pick up the habit of using `ArrayRef` by looking at the test code.
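The owning pattern the tests switch to looks like this (sketch):
```cpp
auto expected_sizes = std::vector<int64_t>({2, 16, 6});  // owns its storage
ASSERT_EQ(output.sizes(), expected_sizes);  // ArrayRef and std::vector compare elementwise
```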
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27884
Differential Revision: D17921646
Pulled By: yf225
fbshipit-source-id: 461e79fc22b598aac230d36cc028085ce6cbe937
Summary:
In accordance with https://github.com/pytorch/pytorch/issues/25883, I added the `MultiLabelSoftMarginLoss` module and `multilabel_soft_margin_loss` functional.
It looks like there isn't a C++ ATen implementation of `multilabel_soft_margin_loss`, so I translated the Python version, which does not rely on a C/C++ backend either.
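A hedged usage sketch of the new functional, using default options:
```cpp
#include <torch/torch.h>

int main() {
  namespace F = torch::nn::functional;
  auto input = torch::randn({3, 4});
  auto target = torch::empty({3, 4}).random_(2);  // multi-hot labels in {0, 1}
  auto loss = F::multilabel_soft_margin_loss(input, target);
}
```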
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27669
Differential Revision: D17907608
Pulled By: yf225
fbshipit-source-id: ccb02951e009973c2adbe604593ce929f10c39eb
Summary:
Per https://github.com/pytorch/pytorch/issues/25525 we want to clean up the distributed autograd context on all nodes, in addition to the local one. To do this, we want to send async RPCs to the other nodes telling them to clean up the context.
The first step for this is for a node's context to know about the other workers. This PR does two things:
1) Adds the necessary data structures and getter functions to `DistAutogradContext`
2) Refactors calls to `addSendRpcBackward` to take in the `worker_id` as an additional argument
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26324
Differential Revision: D17769411
Pulled By: rohan-varma
fbshipit-source-id: b7327d1209a574e2e88cb197edff3103024d51ad
Summary:
Addresses https://github.com/pytorch/pytorch/issues/27048
PR Summary:
Files Added:
- torch/csrc/api/include/torch/nn/options/normalization.h
- torch/csrc/api/include/torch/nn/functional/normalization.h
Files Modified:
- test/cpp/api/functional.cpp
- torch/csrc/api/include/torch/nn/functional.h
---
yf225 : I couldn't find a C++ equivalent of gradcheck(); is there such a function, or is it sufficient to call .backward() in the test body? I don't think any solutions are checked for in the Python tests.
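If the functional added here is `F::normalize` (an assumption based on the file names), a hedged usage sketch might look like this, with the `.backward()` call echoing the testing question above:
```cpp
#include <torch/torch.h>

int main() {
  namespace F = torch::nn::functional;
  auto x = torch::randn({3, 4}, torch::requires_grad());
  // L2-normalize each row; the op is differentiable, so backward() runs.
  auto y = F::normalize(x, F::NormalizeFuncOptions().p(2).dim(1));
  y.sum().backward();
}
```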
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27280
Differential Revision: D17902109
Pulled By: yf225
fbshipit-source-id: 1bce1a88103d0f1848633fec90fde95ea8f3d1ed