Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35022
This PR fixes `AdaptiveAvgPool{2,3}d` and `AdaptiveMaxPool{2,3}d` implementation to match the Python API implementation. Particularly, `output_size` is changed to accept `c10::nullopt` in its elements, matching the Python API behavior.
**TODO**: cherry-pick this PR into v1.5 release branch.
Test Plan: Imported from OSS
Differential Revision: D20559890
Pulled By: yf225
fbshipit-source-id: ccddbd278dd39165cf1dda11fc0e49387c76dbef
Summary:
This PR adds `RNNCell` / `LSTMCell` / `GRUCell` layers to the C++ frontend, with implementations exactly matching the Python API equivalent.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34400
Differential Revision: D20316859
Pulled By: yf225
fbshipit-source-id: bb7cee092622334043c0d0fd0fcb4e75e707699c
Summary:
This PR is BC-breaking in the following way:
- The deprecated `torch::nn::BatchNorm` is removed in favor of `torch::nn::BatchNorm{1,2,3}d`
- The deprecated `torch::nn::FeatureDropout` is removed in favor of `torch::nn::Dropout{2,3}d`
- The deprecated `torch::nn::modules_ordered_dict` is removed. User should do `Sequential sequential({{"m1", MyModule(1)}, {"m2", MyModule(2)}})` instead.
- The deprecated `torch::nn::init::Nonlinearity` is removed, in favor of the following enums:
- `torch::kLinear`
- `torch::kConv1D`
- `torch::kConv2D`
- `torch::kConv3D`
- `torch::kConvTranspose1D`
- `torch::kConvTranspose2D`
- `torch::kConvTranspose3D`
- `torch::kSigmoid`
- `torch::kTanh`
- `torch::kReLU`
- `torch::kLeakyReLU`
- The deprecated `torch::nn::init::FanMode` is removed, in favor of the following enums:
- `torch::kFanIn`
- `torch::kFanOut`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34508
Differential Revision: D20351601
Pulled By: yf225
fbshipit-source-id: cca0cd112f29a31bb023e348ca8f82780e42bea3
Summary:
Hi yf225,
I have a few doubts related to implementation:
1) What tests do I have to write?
2) What does _load_state_from_dict does?
3) Do I need to override reset() function as I can not see it's utility?
4) InstanceNormOptions could be removed with BatchNormOptions, but I find that
`track_running_status` is not defined instead `stateful` is defined.
InstanceNorm{1,2,3}d https://github.com/pytorch/pytorch/issues/25883
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28790
Differential Revision: D18588666
Pulled By: yf225
fbshipit-source-id: bb9b81f01f62c3fc8765fa0ba0716768087ee155
Summary:
Hi yf225 , I have added **NLLLoss and CrossEntropyLoss.**
```
Also, while using log_softmax in cross_entropy_loss, I am getting an error
../caffe2/../torch/csrc/api/include/torch/nn/functional/loss.h:537:63: error: no matching function for call to log_softmax(const at::Tensor&)’
const Tensor& log_softmax_input = torch::log_softmax(input);
aten/src/ATen/Functions.h:5551:22: note: candidate: at::Tensor at::log_softmax(const at::Tensor&, int64_t, c10::optional<c10::ScalarType>)
static inline Tensor log_softmax(const Tensor & self, int64_t dim, c10::optional<ScalarType> dtype) {
^~~~~~~~~~~
aten/src/ATen/Functions.h:5551:22: note: candidate expects 3 arguments, 1 provided
```
I think the other two parameters should be optional as in python frontend(shown in documentation here at https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.log_softmax ). Rest, there were no errors in build and tests have passed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29812
Differential Revision: D18548249
Pulled By: yf225
fbshipit-source-id: 2ab350abd2a6f498d4dba2345f51ad87471f3038
Summary:
This PR changes the implementation of C++ Conv{1,2,3}d layers to exactly match the Python version, and add F::conv{1,2,3}d functionals. For more thorough testing, I will rely on the parity test mechanism which uses values from `common_nn.py` to generate the inputs and options that we are interested in testing.
This PR is BC-breaking in the following way:
In `Conv{1,2,3}dOptions`:
- `with_bias` is renamed to `bias`.
- `input_channels` is renamed to `in_channels`.
- `output_channels` is renamed to `out_channels`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28917
Differential Revision: D18471526
Pulled By: yf225
fbshipit-source-id: 7a33f60654ad93cc2e043245e7ff9e0ef9da15b3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29404
This PR makes all non-input arguments to functionals part of its options parameters, so that we won't break backward compatibility even if we add or reorder some of the non-input arguments to functionals in the future.
Test Plan: Imported from OSS
Differential Revision: D18378526
Pulled By: yf225
fbshipit-source-id: f5cf6bdfb844e75bf94fdee58c121e0955631b6e
Summary:
Fixes https://github.com/pytorch/pytorch/issues/17662
I'm not sure if `arange` needs to be in python_arg_parser at all, given the schemas in native_functions.yaml. In any case this at least fixes the dytpe mismatch.
In follow up PRs I will try to handle some of the other ops that do type inference at the python level, like randint.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27629
Differential Revision: D17885939
Pulled By: eellison
fbshipit-source-id: f97a8bc722b7ab77de1c42a992e49a4a3175ad60
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29364
Currently, we use `torch::nn::*Options` both as module options and functional options. However, this makes it very hard to manage the parameters in `torch::nn::*Options`, because a module's constructor can take a different set of arguments than the module's equivalent functional (e.g. `torch.nn.BatchNorm1d` takes `num_features, eps=1e-5, momentum=0.1, affine=True,
track_running_stats=True`, while `F::batch_norm` takes `running_mean, running_var, weight=None, bias=None, training=False, momentum=0.1, eps=1e-5`).
This PR resolves the above problem by making `F::*FuncOptions` a different class from `torch::nn::*Options` when necessary (i.e. when a module's constructor takes a different set of arguments than the module's equivalent functional). In the rest of the cases where the module constructor takes the same set of arguments as the module's equivalent functional, `F::*FuncOptions` is an alias of `torch::nn::*Options`.
Also as part of this PR, we change all functional options to pass-by-value, to make the semantics consistent across all functionals.
Test Plan: Imported from OSS
Differential Revision: D18376977
Pulled By: yf225
fbshipit-source-id: 8d9c240d93bfd5af0165b6884fdc912476b1d06b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28523
New features:
1. Previously, `torch::tensor({true, false, true})` throws `"tensor_cpu" not implemented for 'Bool'`. After this PR, it produces the correct bool tensor, matching the Python API behavior.
2. Tensors with zero-size dimensions are now supported, e.g. `torch::tensor({{}, {}})` produces a tensor with sizes `{2, 0}`, matching the Python API behavior.
BC-breaking bug fixes:
1. Previously, `torch::tensor({{1}, {2}})` produces a tensor of sizes `{2}`. After this PR, it produces a tensor of sizes `{2, 1}`, matching the Python API behavior.
2. Fixed semantics of `torch::tensor(1.1)`: it now returns a 0-dim tensor instead of a 1-dim tensor, matching the Python API behavior.
3. Previously, when passed a non-dtype `TensorOptions` to the `torch::tensor` constructor, it always produces a tensor of dtype `float`. After this PR, it produces tensor of different dtypes based on the dtype of the braced-init-list, matching the behavior of the no-options case.
```cpp
// Previously:
torch::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
torch::tensor({{1, 2, 3}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
torch::tensor({1., 2., 3.}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
torch::tensor({{1., 2., 3.}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
// Now:
torch::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> int
torch::tensor({{1, 2, 3}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> int
torch::tensor({1., 2., 3.}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> double
torch::tensor({{1., 2., 3.}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> double
// As comparison, currently:
torch::tensor({1, 2, 3}).dtype() -> int
torch::tensor({{1, 2, 3}}).dtype() -> int
torch::tensor({1., 2., 3.}).dtype() -> double
torch::tensor({{1., 2., 3.}}).dtype() -> double
```
Notes:
1. From now on, the behavior of `at::tensor(scalar_value)` (which produces a 1-dim tensor) would be different from `torch::tensor(scalar_value)` (which produces a 0-dim tensor). I will fix the behavior of `at::tensor(scalar_value)` in a follow-up PR.
2. From now on, the behavior of `at::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/))` (which produces a `float` tensor) would be different from `torch::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/))` (which produces a an `int` tensor). I will fix this behavior of `at::tensor` constructor in a follow-up PR.
Context for the changes in this PR:
The motivation comes from fixing the "`torch::tensor({{1}, {2}})` gives tensor of wrong sizes" bug - in order to fix it, I have to move the handling of `at::ArrayRef` and `std::vector` into `InitListTensor` (see below on why we need to do this) and renamed `InitListTensor` to `TensorDataContainer`. After such changes, support for bool values comes out of the box without extra effort, and support for tensors with zero-size dimensions only requires adding a default constructor for `TensorDataContainer`, so I added those two in this PR.
For the semantic change of `torch::tensor(1.1)`, it's actually more effort to preserve the original wrong behavior (i.e. we need to check the sizes of the tensor converted from `TensorDataContainer` and reshape any scalar tensor to a 1-D tensor). I think preserving the original wrong behavior doesn't give us much value, and since the above changes naturally fix the problem, we should just start using the right behavior instead.
For the "constructor with non-dtype options behavior" fix, the code looks simpler and easier to reason about with the fix, so I included it in this PR.
--------
Why we need to move the handling of `at::ArrayRef` and `std::vector` into `TensorDataContainer`:
`torch::tensor({{1}, {2}})` can match this function overload:
`torch::tensor(at::ArrayRef<int> values)`, because `{1}` and `{2}` can be treated as
a list-initialization of an `int` value. However, this will produce a Tensor with sizes `{2}`,
but we actually want a Tensor with sizes `{2, 1}`. In order to avoid matching this function overload,
we removed the function overload and moved the ability to convert `at::ArrayRef<T>`
(and similarly `std::vector<T>`) into `TensorDataContainer`, and since for braced-init-list the
`TensorDataContainer(std::initializer_list<TensorDataContainer>)` constructor is always preferred over all other constructors, it will take the `std::initializer_list` path, and all is good.
Test Plan: Imported from OSS
Differential Revision: D18234625
Pulled By: yf225
fbshipit-source-id: 0f3f6912e82e2117d2103e31b74e7e97baaa8693
Summary:
Adds `torch::nn::functional::fold` support and updates `Fold::pretty_print` in the C++ API for more thorough Python parity.
Note: Small updates in source files to maintain consistency elsewhere.
Reviewer: yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28732
Differential Revision: D18219955
Pulled By: yf225
fbshipit-source-id: fd2e9be8f17db77c1b1f384c0d2e16cc34858c0c
Summary:
Add torch::nn::BatchNorm1d function/module support for the C++ API.
torch::nn::BatchNorm{2,3}d will be added after this PR is merged.
Related Issue: https://github.com/pytorch/pytorch/issues/25883
Reviewer: yf225
I would like to discuss about below items.
* Necessity of `num_batches_tracked` in `BatchNormImplBase`
* `num_batches_tracked` is needed to calculate `momentum` when we do not feed `momentum` argument in Python API. But in C++ API, `momentum` argument has a default value.
* `num_batches_tracked` is only used for counting up `BatchNorm1d::foward()` call. I think it is no necessary for user anymore.
* The design of `BatchNorm{1,2,3}dOptions`
* We have already `BatchNormOptions` used for deprecated `BatchNorm` module. However, it is hard to use it for `BatchNorm{1,2,3}dOptions` because of the arguments disagreement of each modules.
* In this PR, I introduce `BatchNormOptionsv2` template class for the `BatchNorm{1,2,3}dOptions`. But I'm not sure this design is good or not.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28176
Differential Revision: D18196843
Pulled By: yf225
fbshipit-source-id: 667e2b5de4150d5776c41b9088c9e6c2ead24cd4
Summary:
This PR is BC-breaking in the following way:
Previous, we require the use of `std::string` to specify the mode for `EmbeddingBag`. After this PR, we use variant-based enums such as `torch::kSum` / `torch::kMean` / `torch::kMax` to specify the mode for `EmbeddingBag`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28330
Differential Revision: D18127116
Pulled By: yf225
fbshipit-source-id: 15cd86c764777f4d399587be92cda15b6ce8524b
Summary:
This PR adds ```MSELoss```, ```KLDivLoss``` and ```BCELoss```. The tests for ```BCELoss``` fail with the following error:
```
unknown file: Failure
C++ exception with description "autograd_meta() INTERNAL ASSERT FAILED at /home/shahriar/Contrib/pytorch/c10/core/TensorImpl.h:533, please report a bug to PyTorch. set_requires_grad is not implemented for Tensor (set_requires_grad at /home/shahriar/Contrib/pytorch/c10/core/TensorImpl.h:533)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27156
Differential Revision: D17960323
Pulled By: yf225
fbshipit-source-id: 84b8431064f2f573679c03a8d7994e3e2f81a4d1
Summary:
Adds support for the Bilinear layer to the C++ frontend
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26082
Differential Revision: D17954148
Pulled By: yf225
fbshipit-source-id: 5e746bdea29b00e25969cd7a22044b8059b53687
Summary:
`at::ArrayRef` / `torch::IntArrayRef` should be discouraged in user code, because users might not be aware of the fact that it doesn't own the underlying data, which already leads to memory access bugs when they try to write the following:
```cpp
auto expected_sizes = torch::IntArrayRef({2, 16, 6}); // The memory that represents `{2, 16, 6}` is released after this line
ASSERT_EQ(output.sizes(), expected_sizes); // `expected_sizes` is pointing to invalid memory region
```
This PR changes all usage of `at::ArrayRef` and `torch::IntArrayRef` to the corresponding `std::vector` version, so that users won't pick up the habit of using `ArrayRef` by looking at the test code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27884
Differential Revision: D17921646
Pulled By: yf225
fbshipit-source-id: 461e79fc22b598aac230d36cc028085ce6cbe937
Summary:
In accordance with https://github.com/pytorch/pytorch/issues/25883, I added the `MultiLabelSoftMarginLoss` module and `multilabel_soft_margin_loss` functional.
It looks like there isn't a C++ ATen implementation of `multilabel_soft_margin_loss`, so I translated the python version, which does not rely on a C/C++ backend either.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27669
Differential Revision: D17907608
Pulled By: yf225
fbshipit-source-id: ccb02951e009973c2adbe604593ce929f10c39eb
Summary:
Hi yf225 , I had to create a new branch to tackle merge conflict since I am using cloud due to some limitations on my PC. Therefore, I don't have enough command there.
Also, I have incorporated the changes you have put before here
https://github.com/pytorch/pytorch/pull/27613
Also, it would be great if you could recommend me some resources to work smmothly on GCP..:-D
Thank you
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27713
Differential Revision: D17899695
Pulled By: yf225
fbshipit-source-id: eb6643223148774a5cbbd093bdcc5623872e5bba
Summary:
Hi yf225 , here is the C++ frontend API MultiMarginLoss implementation and tests https://github.com/pytorch/pytorch/issues/27198. Could you review it and tell me if it is okay?
I am not entirely sure I used `c10::optional` correctly, but `options.weight()` resulted in a compilation error, so I went with `options.weight().value()` instead of `value_or()` to follow the logic in `torch.nn._WeightedLoss.register_buffer` (where one can pass a `None` value).
Oh, and are the tests supposed to be skipped or did I do something wrong? I ran `pytest test/test_cpp_api_parity.py -k Loss -v` , and the `L1Loss` test passed but the others were skipped...
Thank you for the review in any case!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27424
Differential Revision: D17839963
Pulled By: yf225
fbshipit-source-id: f4b6012590cf22d56d42751c214df80cce717cb8
Summary:
added more variables to EmbeddingOptions and updated EmbeddingImpl reset, forward functions. Also added EmbeddingBag.
-----
This PR is BC-breaking in the following way:
Previously, `EmbeddingOptions` supports `count` and `dimension` as options arguments. After this PR, they are renamed to `num_embeddings` and `embedding_dim` respectively.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26358
Differential Revision: D17714337
Pulled By: yf225
fbshipit-source-id: f9f969c68e4bece106b92f8e2e02ac39c8455fb7
Summary:
This PR makes the following improvements:
1. Add `forward_with_indices` method to all C++ MaxPool modules, to return the max indices along with the outputs. (We can't make two `forward` methods that return different types based on input, because that will break the type deduction of `torch::detail::return_type_of_forward_t`)
2. Add `max_poolNd_with_indices` to `torch::nn::functional`, to be used when indices of the max values are needed. (We can't merge this with `torch::nn::functional::max_poolNd` because the return type of `max_poolNd` has to be defined statically).
3. Improve `pretty_print` of C++ MaxPoolNd and AvgPoolNd modules to match the Python `extra_repr`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26521
Differential Revision: D17507358
Pulled By: yf225
fbshipit-source-id: b6c0e2b27b38378cdc0c75f4bfc797b3c6b17cd9
Summary:
C++ `nn::Distance` tests can take advantage of the newly released multi-dimensional tensor constructor https://github.com/pytorch/pytorch/pull/26210 to simplify the tensor constructions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26539
Differential Revision: D17501041
Pulled By: yf225
fbshipit-source-id: 21d5f95ab3ec02227115c823c581218cee2ce458
Summary:
This PR adds Average Pool module to C++ front-end.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25800
Differential Revision: D17318094
Pulled By: yf225
fbshipit-source-id: c914c0e802bbe5f1d1f0a21a669c28bc956899db
Summary:
yf225 This is L1Loss module. I don't think that ```_Loss``` and ```_WeightedLoss``` as base Python classes do anything. First one sets reduction type and also takes in ```reduce``` parameter which is deprecated. The second one only registers ```weight``` parameter. I don't think that we should keep this structure. What do you think?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25902
Differential Revision: D17307045
Pulled By: yf225
fbshipit-source-id: ad3eda2ee8dcf4465054b376c1be89b39d11532f
Summary:
This PR simplifies header inclusion in `test/cpp/api/modules.cpp`, so that when we add a new `torch::nn` module and add the test in `modules.cpp`, we can check that the new module's header is included in `torch/torch.h`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25921
Differential Revision: D17303220
Pulled By: yf225
fbshipit-source-id: 327db0ff2f075d52e7b594b3dffc5a59441e0931