505 Commits

Author SHA1 Message Date
James Butterworth
37ab711822 Adding learning rate schedulers to C++ API (#52268)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50577

Learning rate schedulers had not yet been implemented for the C++ API.

This pull request introduces the learning rate scheduler base class and the StepLR subclass. Furthermore, it modifies the existing OptimizerOptions such that the learning rate scheduler can modify the learning rate.
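
As a rough illustration of what a step schedule computes, here is a plain-Python sketch (not the C++ API added in this PR). The function name `step_lr` and its parameters are assumptions chosen to mirror the `step_size` and `gamma` options of Python's `torch.optim.lr_scheduler.StepLR`: the learning rate is multiplied by `gamma` once every `step_size` epochs.

```python
def step_lr(base_lr: float, epoch: int, step_size: int, gamma: float = 0.1) -> float:
    """Learning rate after `epoch` completed epochs under a step-decay schedule.

    Hypothetical helper for illustration only; not part of PyTorch.
    """
    return base_lr * gamma ** (epoch // step_size)


# Halve the learning rate every 2 epochs, starting from 0.1.
schedule = [step_lr(0.1, epoch, step_size=2, gamma=0.5) for epoch in range(6)]
print(schedule)
```

In the real API the scheduler writes the new value back into the optimizer's options, which is why this PR also had to make the learning rate in OptimizerOptions modifiable.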

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52268

Reviewed By: mrshenli

Differential Revision: D26818387

Pulled By: glaringlee

fbshipit-source-id: 2b28024a8ea7081947c77374d6d643fdaa7174c1
2021-03-10 23:09:51 -08:00
Sam Estep
8c798e0622 Forbid trailing whitespace (#53406)
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857

These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
  - `GLOSSARY.md`
  - `aten/src/ATen/core/op_registration/README.md`
  - `scripts/README.md`
  - `torch/csrc/jit/codegen/fuser/README.md`

The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```
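
For readers without GNU sed, the effect of the `s/ *$//` expression can be approximated in plain Python. This is a sketch only: the helper name `strip_trailing_spaces` is an assumption, and like the sed expression it removes trailing spaces but leaves tabs alone.

```python
import re


def strip_trailing_spaces(text: str) -> str:
    """Delete any run of spaces that sits directly before a line end."""
    # re.MULTILINE makes `$` match at the end of every line, not just the last.
    return re.sub(r" +$", "", text, flags=re.MULTILINE)


sample = "first line   \nsecond line\n  indented, trailing space \n"
print(repr(strip_trailing_spaces(sample)))
```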

I looked over the auto-generated changes and didn't see anything that looked problematic.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406

Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377

This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348

Reviewed By: walterddr, seemethere

Differential Revision: D26856620

Pulled By: samestep

fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
2021-03-05 17:22:55 -08:00
kshitij12345
c4c77e2001 [special] add torch.special namespace (#52296)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/50345

* Add `torch.special` namespace
* Add `torch.special.gammaln` (alias to `torch.lgamma`)

TODO:
* Add proper entries for docs.
   * [x] Add .rst file entry
   * [x] Add documentation
   * [x] Update `lgamma` OpInfo entry for alias to `special.gammaln`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52296

Reviewed By: ngimel

Differential Revision: D26754890

Pulled By: mruberry

fbshipit-source-id: 73479f68989d6443ad07b7b02763fa98973c15f6
2021-03-04 00:04:36 -08:00
Joel Schlosser
e86476f736 Huber loss (#50553)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/48595.

## Background

This PR implements HuberLoss, which differs from SmoothL1Loss by a factor of beta. The current implementation does not share logic between the two. Feedback is welcome for the optimal way to minimize code duplication while remaining performant.

I've done some early [benchmarking](https://pytorch.org/tutorials/recipes/recipes/benchmark.html#collecting-instruction-counts-with-callgrind) with Huber calling in to the Smooth L1 kernel and scaling afterwards; for the simple test case I used, instruction counts are as follows:
```
Huber loss calls dedicated Huber kernel: 2,795,300
Huber loss calls Smooth L1 kernel and scales afterwards: 4,523,612
```
With these numbers, instruction counts are ~62% higher when using the pre-existing Smooth L1 kernel.
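
The "factor of beta" relationship and the ~62% figure can both be checked with a few lines of plain Python. This is a scalar sketch under the usual textbook definitions of the two losses, not the kernels from this PR; the function names are placeholders.

```python
def smooth_l1(diff: float, beta: float) -> float:
    """Smooth L1 on a single residual: quadratic below beta, linear above."""
    a = abs(diff)
    return 0.5 * a * a / beta if a < beta else a - 0.5 * beta


def huber(diff: float, delta: float) -> float:
    """Huber loss on a single residual: quadratic up to delta, linear above."""
    a = abs(diff)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


# With delta == beta, Huber equals beta times Smooth L1 on both branches.
for diff in (0.3, 1.0, 2.0, 2.5):
    assert abs(huber(diff, 2.0) - 2.0 * smooth_l1(diff, 2.0)) < 1e-12

# Relative overhead of reusing the Smooth L1 kernel, from the counts above.
overhead = 4_523_612 / 2_795_300 - 1
print(f"{overhead:.1%}")  # 61.8%
```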

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50553

Test Plan:
```
python test/test_nn.py TestNN.test_HuberLoss
python test/test_nn.py TestNN.test_HuberLoss_delta
python test/test_nn.py TestNN.test_huber_loss_invalid_delta
python test/test_nn.py TestNNDeviceTypeCPU.test_smooth_l1_loss_vs_huber_loss_cpu
python test/test_nn.py TestNNDeviceTypeCUDA.test_smooth_l1_loss_vs_huber_loss_cuda
python test/test_nn.py TestNNDeviceTypeCPU.test_invalid_reduction_strings_cpu
python test/test_nn.py TestNNDeviceTypeCUDA.test_invalid_reduction_strings_cuda
python test/test_nn.py TestNN.test_loss_equal_input_target_shape
python test/test_nn.py TestNN.test_pointwise_loss_broadcast
python test/test_overrides.py
python test/test_jit.py TestJitGeneratedFunctional.test_nn_huber_loss
python test/test_type_hints.py
python test/test_cpp_api_parity.py
build/bin/test_api
```

## Documentation
<img width="677" alt="Screen Shot 2021-01-14 at 4 25 08 PM" src="https://user-images.githubusercontent.com/75754324/104651224-5a445980-5685-11eb-884b-14ea517958c2.png">
<img width="677" alt="Screen Shot 2021-01-14 at 4 24 35 PM" src="https://user-images.githubusercontent.com/75754324/104651190-4e589780-5685-11eb-974d-8c63a89c050e.png">
<img width="661" alt="Screen Shot 2021-01-14 at 4 24 45 PM" src="https://user-images.githubusercontent.com/75754324/104651198-50225b00-5685-11eb-958e-136b36f6f8a8.png">
<img width="869" alt="Screen Shot 2021-01-14 at 4 25 27 PM" src="https://user-images.githubusercontent.com/75754324/104651208-53b5e200-5685-11eb-9fe4-5ff433aa13c5.png">
<img width="862" alt="Screen Shot 2021-01-14 at 4 25 48 PM" src="https://user-images.githubusercontent.com/75754324/104651209-53b5e200-5685-11eb-8051-b0cfddcb07d3.png">

Reviewed By: H-Huang

Differential Revision: D26734071

Pulled By: jbschlosser

fbshipit-source-id: c98c1b5f32a16f7a2a4e04bdce678080eceed5d5
2021-03-02 17:30:45 -08:00
Jeffrey Wan
aa2fede201 Fix autograd when inputs contains tensors without materialized grad_fn (#51940)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39784
At the time the issue was filed, there was only issue (1) below.

There are actually now two issues here:
1. We always set all inputs passed in through `inputs` arg as `needed = True` in exec_info. So if we pass in an input that has a grad_fn that is not materialized, we create an entry of exec_info with nullptr as key with `needed = True`. Coincidentally, when we perform simple arithmetic operations, such as "2 * x", one of the next edges of mul is an invalid edge, meaning that its grad_fn is also nullptr. This causes the discovery algorithm to set all grad_fns that have a path to this invalid_edge as `needed = True`.
2. Before the commit that enabled the engine to skip the dummy node, we knew that the root node was always needed, i.e., we hardcoded `exec_info[&graph_root]=true`. The issue was that this logic wasn't updated after the code was changed to skip the graph root.

To address (1), instead of passing in an invalid edge if an input in `inputs` has no grad_fn, we create a dummy grad_fn. This is done in both python and cpp entry points. The alternative is to add logic for both backward() and grad() cases to check whether the grad_fn is nullptr and set needed=false in that case (the .grad() case would be slightly more complicated than the .backward() case here).

For (2), we perform one final iteration of the discovery algorithm so that we really know whether we need to execute the graph root.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51940

Reviewed By: VitalyFedyunin

Differential Revision: D26369529

Pulled By: soulitzer

fbshipit-source-id: 14a01ae7988a8de621b967a31564ce1d7a00084e
2021-02-11 09:22:15 -08:00
Yanli Zhao
c9cae1446f fix unflatten_dense_tensor when there is empty tensor inside (#50321)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50321

The Quantization team reported that when two empty tensors are replicated among ranks, the two empty tensors start to share storage after resizing.

The root cause is that unflatten_dense_tensor unflattened each empty tensor as a view of the flat tensor, so the empty tensors shared storage with other tensors.

This PR avoids unflattening an empty tensor as a view of the flat tensor, so that empty tensors do not share storage with other tensors.

Test Plan: unit test

Reviewed By: pritamdamania87

Differential Revision: D25859503

fbshipit-source-id: 5b760b31af6ed2b66bb22954cba8d1514f389cca
2021-01-23 12:14:34 -08:00
Richard Barnes
89cafde8a4 Modernize for-loops (#50912)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50912

Test Plan: Sandcastle tests

Reviewed By: ansley

Differential Revision: D26001948

fbshipit-source-id: 3bfe6a8283a2b1882ed472f836ae1b6e720e519f
2021-01-22 10:53:24 -08:00
Edward Yang
8eee8460f8 codegen: Resolve overload ambiguities created by defaulted arguments (#49348)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49348

This is a redux of #45666 post refactor, based off of
d534f7d4c5
Credit goes to peterbell10 for the implementation.

Fixes #43945.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D25594004

Pulled By: ezyang

fbshipit-source-id: c8eb876bb3348308d6dc8ba7bf091a2a3389450f
2021-01-04 11:59:16 -08:00
Sebastian Messmer
c7e9abb66a Making ops c10-full: list of optional tensors (#49138)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49138

See for details: https://fb.quip.com/QRtJAin66lPN

We need to model optional types explicitly, mostly for schema inference. So we cannot pass a `Tensor?[]` as `ArrayRef<Tensor>`, instead we need to pass it as an optional type. This PR changes it to `torch::List<c10::optional<Tensor>>`. It also makes the ops c10-full that were blocked by this.

## Backwards Compatibility

- This should not break the Python API because the representation in Python is the same and python_arg_parser just transforms the python list into a `List<optional<Tensor>>` instead of into a `List<Tensor>`.
- This should not break serialized models because there's some logic that allows loading a serialized `List<Tensor>` as `List<optional<Tensor>>`, see https://github.com/pytorch/pytorch/pull/49138/files#diff-9315f5dd045f47114c677174dcaa2f982721233eee1aa19068a42ff3ef775315R57
- This will break backwards compatibility for the C++ API. There is no implicit conversion from `ArrayRef<Tensor>` (which was the old argument type) to `List<optional<Tensor>>`. One common call pattern is `tensor.index({indices_tensor})`, where indices_tensor is another `Tensor`; that will continue working because the `{}` initializer_list constructor for `List<optional<Tensor>>` can take `Tensor` elements that are implicitly converted to `optional<Tensor>`. Another common call pattern was `tensor.index(indices_tensor)`, where previously the `Tensor` got implicitly converted to an `ArrayRef<Tensor>`. Implicitly converting `Tensor -> optional<Tensor> -> List<optional<Tensor>>` would take two implicit conversions, and C++ doesn't allow chaining two user-defined implicit conversions, so those call sites have to be rewritten to `tensor.index({indices_tensor})`.

ghstack-source-id: 119269131

Test Plan:
## Benchmarks (C++ instruction counts):
### Forward
#### Script
```py
from torch.utils.benchmark import Timer

counts = Timer(
    stmt="""
        auto t = {{op call to measure}};
    """,
    setup="""
        using namespace torch::indexing;
        auto x = torch::ones({4, 4, 4});
    """,
    language="cpp",
).collect_callgrind(number=1_000)
print(counts)
```
#### Results
|  Op call                                                              |before   |after   |delta  |delta %|
|------------------------------------------------------------------------|---------|--------|-------|------|
|x[0] = 1                                                                |11566015 |11566015|0      |0.00% |
|x.index({0})                                                            |6807019  |6801019 |-6000  |-0.09%|
|x.index({0, 0})                                                         |13529019 |13557019|28000  |0.21% |
|x.index({0, 0, 0})                                                      |10677004 |10692004|15000  |0.14% |
|x.index({"..."})                                                        |5512015  |5506015 |-6000  |-0.11%|
|x.index({Slice(None, None, None)})                                      |6866016  |6936016 |70000  |1.02% |
|x.index({None})                                                         |8554015  |8548015 |-6000  |-0.07%|
|x.index({false})                                                        |22400000 |22744000|344000 |1.54% |
|x.index({true})                                                         |27624088 |27264393|-359695|-1.30%|
|x.index({"...", 0, true, Slice(1, None, 2), torch::tensor({1, 2})})|123472000|123463306|-8694|-0.01%|

### Autograd
#### Script
```py
from torch.utils.benchmark import Timer

counts = Timer(
    stmt="""
        auto t = {{op call to measure}};
    """,
    setup="""
        using namespace torch::indexing;
        auto x = torch::ones({4, 4, 4}, torch::requires_grad());
    """,
    language="cpp",
).collect_callgrind(number=1_000)
print(counts)
```
Note: the script measures the **forward** path of an op call with autograd enabled (i.e. calls into VariableType). It does not measure the backward path.

#### Results
|  Op call                                                              |before   |after   |delta  |delta %|
|------------------------------------------------------------------------|---------|--------|-------|------|
|x.index({0})                                                            |14839019|14833019|-6000| 0.00% |
|x.index({0, 0})                                                         |28342019|28370019|28000| 0.00% |
|x.index({0, 0, 0})                                                      |24434004|24449004|15000| 0.00% |
|x.index({"..."})                                                       |12773015|12767015|-6000| 0.00% |
|x.index({Slice(None, None, None)})                                      |14837016|14907016|70000| 0.47% |
|x.index({None})                                                        |15926015|15920015|-6000| 0.00% |
|x.index({false})                                                        |36958000|37477000|519000| 1.40% |
|x.index({true})                                                         |41971408|42426094|454686| 1.08% |
|x.index({"...", 0, true, Slice(1, None, 2), torch::tensor({1, 2})}) |168184392|164545682|-3638710| -2.16% |

Reviewed By: bhosmer

Differential Revision: D25454632

fbshipit-source-id: 28ab0cffbbdbdff1c40b4130ca62ee72f981b76d
2021-01-04 05:04:02 -08:00
anjali411
97c17b4772 Fix auto exponent issue for torch.pow (#49809)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49809

Fixes https://github.com/pytorch/xla/issues/2688 #46936

Test Plan: Imported from OSS

Reviewed By: nikithamalgifb

Differential Revision: D25724176

Pulled By: anjali411

fbshipit-source-id: 16287a1f481e9475679b99d6fb45de840da225be
2020-12-29 17:02:56 -08:00
Joel Schlosser
68d438c9da Add PixelUnshuffle (#49334)
Summary:
Adds an implementation of `torch.nn.PixelUnshuffle` as the inverse operation of `torch.nn.PixelShuffle`. This addresses https://github.com/pytorch/pytorch/issues/2456
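
To make the "inverse operation" claim concrete, here is a small sketch on nested Python lists rather than tensors. The function names and the unbatched `(channels, height, width)` layout are assumptions for illustration; the real modules operate on batched tensors.

```python
def pixel_shuffle(x, r):
    """Rearrange (C*r*r, H, W) nested lists into (C, H*r, W*r)."""
    channels, height, width = len(x) // (r * r), len(x[0]), len(x[0][0])
    return [
        [
            [x[c * r * r + (h % r) * r + (w % r)][h // r][w // r]
             for w in range(width * r)]
            for h in range(height * r)
        ]
        for c in range(channels)
    ]


def pixel_unshuffle(y, r):
    """Rearrange (C, H*r, W*r) nested lists back into (C*r*r, H, W)."""
    channels, height, width = len(y), len(y[0]) // r, len(y[0][0]) // r
    return [
        [
            [y[k // (r * r)][h * r + (k % (r * r)) // r][w * r + k % r]
             for w in range(width)]
            for h in range(height)
        ]
        for k in range(channels * r * r)
    ]


# Four 1x1 channels become one 2x2 channel, and unshuffling restores them.
x = [[[0]], [[1]], [[2]], [[3]]]
shuffled = pixel_shuffle(x, 2)
print(shuffled)
print(pixel_unshuffle(shuffled, 2) == x)
```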

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49334

Test Plan:
```
# Unit tests.
python test/test_nn.py TestNN.test_pixel_shuffle_unshuffle

# Module test.
python test/test_nn.py TestNN.test_PixelUnshuffle

# C++ API tests.
build/bin/test_api

# C++ / python parity tests.
python test/test_cpp_api_parity.py

# JIT test.
python test/test_jit.py TestJitGeneratedFunctional.test_nn_pixel_unshuffle

# Override tests.
python test/test_overrides.py

# Type hint tests.
python test/test_type_hints.py
```

Screenshots of rendered docs:
<img width="876" alt="Screen Shot 2020-12-18 at 12 19 05 PM" src="https://user-images.githubusercontent.com/75754324/102642255-6b07bb00-412b-11eb-88fa-e53e7e8ba720.png">
<img width="984" alt="Screen Shot 2020-12-18 at 12 19 26 PM" src="https://user-images.githubusercontent.com/75754324/102642276-70fd9c00-412b-11eb-8548-445082a2db02.png">
<img width="932" alt="Screen Shot 2020-12-18 at 12 19 34 PM" src="https://user-images.githubusercontent.com/75754324/102642704-19abfb80-412c-11eb-9546-95bdd1c3cf22.png">
<img width="876" alt="Screen Shot 2020-12-22 at 12 51 36 PM" src="https://user-images.githubusercontent.com/75754324/102918259-986aa680-4454-11eb-99e7-a0b4c8b3e283.png">
<img width="869" alt="Screen Shot 2020-12-22 at 12 51 44 PM" src="https://user-images.githubusercontent.com/75754324/102918274-9ef91e00-4454-11eb-94bb-91b58aff47d3.png">

Reviewed By: mruberry

Differential Revision: D25401439

Pulled By: jbschlosser

fbshipit-source-id: 209d92ce7295e51699e83616d0c62170a7ce75c8
2020-12-22 20:14:55 -08:00
Nikita Shulga
020c443fd1 Fix CustomAutogradTest.ReentrantPriority rerun failures (#49581)
Summary:
Clear static variable at the end of the test to ensure test passes after re-runs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49581

Test Plan:
`./bin/test_api "--gtest_filter=CustomAutogradTest.ReentrantPriority" --gtest_repeat=50`
Before the change all subsequent runs of the test failed with
```
../test/cpp/api/autograd.cpp:681: Failure
Expected equality of these values:
  order.size()
    Which is: 310
  10
```
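
The failure mode is easy to reproduce outside of gtest. The following toy sketch uses a module-level list as a stand-in for the static variable; `order` matches the name in the failure message, while `run_test` and `clear_at_end` are hypothetical names.

```python
order = []  # stands in for the static variable shared across test runs


def run_test(clear_at_end: bool) -> int:
    """Append 10 entries, return the size the test would check, optionally reset."""
    order.extend(range(10))
    observed = len(order)
    if clear_at_end:
        order.clear()  # the fix: leave no state behind for the next run
    return observed


# Without the reset, every repeat sees entries left over from earlier runs.
leaky = [run_test(clear_at_end=False) for _ in range(3)]
order.clear()

# With the reset, every repeat observes exactly 10 entries.
fixed = [run_test(clear_at_end=True) for _ in range(3)]
print(leaky, fixed)
```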

Reviewed By: mrshenli

Differential Revision: D25632374

Pulled By: malfet

fbshipit-source-id: 4814d22b5dff15e1b38a0187e51070771fd58370
2020-12-18 00:34:06 -08:00
Igor Gitman
1b6d18aa7c Adding support for CuDNN-based LSTM with projections (#47725)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46213

I haven't updated the documentation yet; I will add those changes soon. There are a few other things that I didn't do, and I want to clarify whether I should.

1. I didn't expose projections in c++ API: torch/csrc/api/src/nn/modules/rnn.cpp. Let me know if this is desirable and I will add those changes.
2. I didn't expose projections in "lstm_cell" function and "_thnn_differentiable_lstm_cell_backward" functions from aten/src/ATen/native/RNN.cpp. As far as I understand, they are not needed for nn.LSTM CPU execution. For lstm_cell, projections don't bring any real benefit, since if cell is used separately, it can be easily added in Python. For "_thnn_differentiable_lstm_cell_backward", I'm actually not sure where exactly that function is used, so I also disabled projections there for now. Please let me know if I should change that.
3. I added a check to the quantized_lstm_<data/input> functions that projections are not supported for quantized LSTMs, but I didn't add any checks to the LSTMCell code. Since I disabled projections in the "lstm_cell" function, they should also not be available for quantized models through any API other than quantized_lstm_<data/input>. Please let me know if I'm not correct and I will add checks to other places.
4. Projections are not supported for CuDNN versions < 7.1.2. Should I add the check for CuDNN version and disable projections in that case? If so, what will be the best way to do that?
5. Currently I added the projection weight as the last weight, so the layout is "w_ih, w_hh, b_ih, b_hh, w_hr". This breaks the assumption that biases come after weights, and thus I had to add additional if-s in various places. An alternative would be a "w_ih, w_hh, w_hr, b_ih, b_hh" layout, in which case the assumption would hold. But in that case I would need to split the loop in the get_parameters function from aten/src/ATen/native/cudnn/RNN.cpp. And in some cases, I would still need to add an "undefined" tensor in the 3rd position, because we get all 5 weights from CuDNN most of the time. So I'm not sure which way is better. Let me know if you think I should change to the weights-then-biases layout.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47725

Reviewed By: zou3519

Differential Revision: D25449794

Pulled By: ngimel

fbshipit-source-id: fe6ce59e481d1f5fd861a8ff7fa13d1affcedb0c
2020-12-16 11:27:02 -08:00
Peter Bell
5180caeeb4 Remove deprecated spectral ops from torch namespace (#48594)
Summary:
Ref https://github.com/pytorch/pytorch/issues/42175

This removes the 4 deprecated spectral functions: `torch.{fft,rfft,ifft,irfft}`. `torch.fft` is also now imported by default.

The actual `at::native` functions are still used in `torch.stft`, so they can't be fully removed yet, but they will be once https://github.com/pytorch/pytorch/issues/47601 has been merged.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48594

Reviewed By: heitorschueroff

Differential Revision: D25298929

Pulled By: mruberry

fbshipit-source-id: e36737fe8192fcd16f7e6310f8b49de478e63bf0
2020-12-05 04:12:32 -08:00
Erjia Guan
c542614e53 Implement C++ ModuleDict (#47707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47707

Fixes #45896

Test Plan: Imported from OSS

Reviewed By: glaringlee

Differential Revision: D24872641

Pulled By: ejguan

fbshipit-source-id: 3d1dc9148ba3bcf66ab9c44ddb5774060bbc365d
2020-11-19 08:07:51 -08:00
Scott Wolchok
4c9eb57914 [PyTorch] Narrow Device to 2 bytes by narrowing DeviceType and DeviceIndex (#47023)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47023

DeviceType pretty clearly only needs 1 byte. DeviceIndex only needs 1 byte given that machines don't have anywhere near 255 GPUs in them as far as I know.
ghstack-source-id: 116901430
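
The size argument can be sanity-checked with Python's `struct` module. This is a sketch under the assumption that both fields are signed 8-bit integers (`"b"` in struct notation); the helper `pack_device` is hypothetical and not part of PyTorch.

```python
import struct

# Assumption: device type and device index are each one signed byte.
DEVICE_FORMAT = "bb"


def pack_device(device_type: int, device_index: int) -> bytes:
    """Pack a (type, index) pair; raises struct.error if a value needs more than 1 byte."""
    return struct.pack(DEVICE_FORMAT, device_type, device_index)


print(struct.calcsize(DEVICE_FORMAT))  # 2
print(len(pack_device(1, 3)))          # 2

try:
    pack_device(1, 200)  # does not fit in a signed byte
except struct.error as exc:
    print("rejected:", exc)
```

The `struct.error` in the last call plays the same role as the assertion mentioned in the test plan: it catches an index that violates the one-byte assumption.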

Test Plan: Existing tests, added assertion to catch if my assumption about DeviceIndex is incorrect

Reviewed By: dzhulgakov

Differential Revision: D24605460

fbshipit-source-id: 7c9a89027fcf8eebd623b7cdbf6302162c981cd2
2020-11-18 19:39:40 -08:00
Mike Ruberry
013e6a3d9d Revert D24698027: Fix auto exponent issue for torch.pow
Test Plan: revert-hammer

Differential Revision:
D24698027 (8ef7ccd669)

Original commit changeset: f23fdb65c925

fbshipit-source-id: 9a67a2c6310c9e4fdefbb421a8cd4fa41595bc9a
2020-11-15 03:58:44 -08:00
anjali411
8ef7ccd669 Fix auto exponent issue for torch.pow (#47024)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47024

Fixes https://github.com/pytorch/pytorch/issues/46936

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#47024 Fix auto exponent issue for torch.pow**

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D24698027

Pulled By: anjali411

fbshipit-source-id: f23fdb65c925166243593036e08214c4f041a63d
2020-11-14 22:50:12 -08:00
Jeffrey Wan
2e5bfa9824 Add input argument to autograd.backward() cpp api (#47214)
Summary:
Helps fix https://github.com/pytorch/pytorch/issues/46373 for the cpp api.

Follow-up to https://github.com/pytorch/pytorch/pull/46855/, which changed the API for Python only.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47214

Reviewed By: agolynski

Differential Revision: D24716139

Pulled By: soulitzer

fbshipit-source-id: 3e1f35968e8dee132985b883481cfd0d1872ccdd
2020-11-04 14:43:59 -08:00
Nikita Shulga
c05ee86edd Fix return-type-is-always-copy warning (#47279)
Summary:
`std::vector<bool>` cannot return its elements by reference, since they are stored as packed bits

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47279

Reviewed By: glaringlee

Differential Revision: D24705188

Pulled By: malfet

fbshipit-source-id: 96e71cc4b9881f92af3b4a508d397deab6d68174
2020-11-03 08:53:24 -08:00
Thomas Viehmann
b5a1be02a0 Add RAII DetectAnomalyGuard (#47164)
Summary:
This is a followup to the C++ anomaly detection mode, implementing the guard.
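
The guard idea (turn a mode on for a scope and undo that on exit, even if an exception is thrown) has a close Python analogue in context managers. The sketch below is only an analogy: the flag, `anomaly_enabled`, and `detect_anomaly_guard` are hypothetical names, and restoring the previous value on exit is an assumption of this sketch rather than a statement about the C++ class.

```python
from contextlib import contextmanager

_anomaly_enabled = False  # hypothetical global mode flag


def anomaly_enabled() -> bool:
    return _anomaly_enabled


@contextmanager
def detect_anomaly_guard():
    """Enable the mode for the duration of a `with` block, then restore it."""
    global _anomaly_enabled
    previous = _anomaly_enabled
    _anomaly_enabled = True
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions, like a C++ destructor.
        _anomaly_enabled = previous


with detect_anomaly_guard():
    print(anomaly_enabled())  # True
print(anomaly_enabled())      # False
```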

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47164

Reviewed By: mruberry

Differential Revision: D24682574

Pulled By: albanD

fbshipit-source-id: b2224a56bf6eca0b90b8e10ec049cbcd5af9d108
2020-11-02 15:07:59 -08:00
Jeffrey Wan
f5073b0c5a Add inputs argument to autograd.backward() (#46855)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46373

As noted in https://github.com/pytorch/pytorch/issues/46373, there needs to be a flag passed into the engine that indicates whether it was executed through the backward api or grad api. Tentatively named the flag `accumulate_grad` since functionally, backward api accumulates grad into .grad while grad api captures the grad and returns it.

Moving changes not necessary to the python api (cpp, torchscript) to a new PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46855

Reviewed By: ngimel

Differential Revision: D24649054

Pulled By: soulitzer

fbshipit-source-id: 6925d5a67d583eeb781fc7cfaec807c410e1fc65
2020-11-02 14:32:38 -08:00
Thomas Viehmann
a81572cdc5 Add anomaly mode for C++ (#46981)
Summary:
This adds anomaly mode for C++.

The backtrace isn't perfect yet, but it's a start.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46981

Reviewed By: IvanKobzarev

Differential Revision: D24631957

Pulled By: albanD

fbshipit-source-id: 4b91e205e7e51f4cf0fbc651da5013a00a3b2497
2020-10-30 15:18:07 -07:00
Xinyu Li
c9bb990707 [c++] Distance-agnostic triplet margin loss (#45377)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45377

This PR adds a C++ implementation of the TripletMarginWithDistanceLoss, for which the Python implementation was introduced in PR #43680.  It's based on PR #44072, but I'm resubmitting this to unlink it from Phabricator.

Test Plan: Imported from OSS

Reviewed By: izdeby

Differential Revision: D24003973

fbshipit-source-id: 2d9ada7260a6f27425ff2fdbbf623dad0fb79405
2020-09-30 12:37:35 -07:00
Brian Hirsh
439930c81b adding a beta parameter to the smooth_l1 loss fn (#44433)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44433

Not entirely sure why, but changing the type of beta from `float` to `double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D23636720

Pulled By: bdhirsh

fbshipit-source-id: caea2a1f8dd72b3b5fd1d72dd886b2fcd690af6d
2020-09-25 16:36:28 -07:00
Peter Bell
da7863f46b Add one dimensional FFTs to torch.fft namespace (#43011)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43011

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D23751850

Pulled By: mruberry

fbshipit-source-id: 8dc5fec75102d8809eeb85a3d347ba1b5de45b33
2020-09-19 23:32:22 -07:00
lixinyu
77cc7d1ecd C++ APIs Transformer NN Module Top Layer (#44333)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44333

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D23584010

Pulled By: glaringlee

fbshipit-source-id: 990026e3f1b5ae276776e344ea981386cb7528fe
2020-09-11 08:25:27 -07:00
generatedunixname89002005287564@sandcastle1415.cln1.facebook.com
1dd658f28f [Codemod][GleanFbcode] Remove dead includes in caffe2/test (#43953)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43953

Reviewed By: malfet

Differential Revision: D23445556

fbshipit-source-id: 89cd6833aa06f35c5d3c99d698abb08cd61ae4ab
2020-09-01 21:48:28 -07:00
Vinod Kumar S
13c7c6227e Python/C++ API Parity: TransformerDecoder (#42886)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/37756

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42886

Reviewed By: zhangguanheng66

Differential Revision: D23385631

Pulled By: glaringlee

fbshipit-source-id: 610a2fabb4c25b2dfd37b33287215bb8872d653d
2020-08-28 20:13:53 -07:00
Mike Ruberry
f4695203c2 Fixes fft function calls for C++ API (#43749)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/43732.

Requires importing the fft namespace in the C++ API, just like the Python API does, to avoid clobbering torch::fft the function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43749

Reviewed By: glaringlee

Differential Revision: D23391544

Pulled By: mruberry

fbshipit-source-id: d477d0b6d9a689d5c154ad6c31213a7d96fdf271
2020-08-28 12:41:30 -07:00
lixinyu
48e08f884e C++ APIs TransformerEncoder (#43187)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43187

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D23182770

Pulled By: glaringlee

fbshipit-source-id: 968846138d4b1c391a74277216111dba8b72d683
2020-08-27 01:31:46 -07:00
lixinyu
e32d014f46 remove empty override pretty_print (#43341)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43341

This removes the empty pretty_print() override, since it overrides the implementation in the Module base class, which is not the intended design here.

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D23244616

Pulled By: glaringlee

fbshipit-source-id: 94b8dfd3697dfc450f53b3b4eee6e9c13cafba7b
2020-08-20 18:48:29 -07:00
lixinyu
269fdb5bb2 prepare to split transformer header file (#43069)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43069

The transformer C++ implementation needs to put TransformerEncoderLayer/DecoderLayer and TransformerEncoder/TransformerDecoder in different headers, since the options classes of TransformerEncoder/Decoder need TransformerEncoderLayer/DecoderLayer as input parameters. Split the header files to avoid cyclic inclusion.

Test Plan: Imported from OSS

Reviewed By: yf225

Differential Revision: D23139437

Pulled By: glaringlee

fbshipit-source-id: 3c752ed7702ba18a9742e4d47d049e62d2813de0
2020-08-17 07:54:05 -07:00
Heitor Schueroff de Souza
3d8c144400 Implemented torch::nn::Unflatten in libtorch (#42613)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42613

Test Plan: Imported from OSS

Reviewed By: glaringlee

Differential Revision: D23030302

Pulled By: heitorschueroff

fbshipit-source-id: 954f1cdfcbd3a62a7f0e887fcf5995ef27222a87
2020-08-14 15:32:13 -07:00
Vinod Kumar S
830423b80b Python/C++ API Parity: TransformerDecoderLayer (#42717)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/37756

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42717

Reviewed By: zhangguanheng66

Differential Revision: D23095841

Pulled By: glaringlee

fbshipit-source-id: 327a5a23c9a3cca05e422666a6d7d802a7e8c468
2020-08-13 20:31:13 -07:00
Heitor Schueroff de Souza
ffc3da35f4 Don't materialize output grads (#41821)
Summary:
Added a new option in AutogradContext to tell autograd to not materialize output grad tensors, that is, don't expand undefined/None tensors into tensors full of zeros before passing them as input to the backward function.

This PR is the second part that closes https://github.com/pytorch/pytorch/issues/41359. The first PR is https://github.com/pytorch/pytorch/pull/41490.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41821

Reviewed By: albanD

Differential Revision: D22693163

Pulled By: heitorschueroff

fbshipit-source-id: a8d060405a17ab1280a8506a06a2bbd85cb86461
2020-08-11 04:27:07 -07:00
lixinyu
98de150381 C++ API TransformerEncoderLayer (#42633)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42633

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D22994332

Pulled By: glaringlee

fbshipit-source-id: 873abdf887d135fb05bde560d695e2e8c992c946
2020-08-07 11:49:42 -07:00
Mike Ruberry
ccfce9d4a9 Adds fft namespace (#41911)
Summary:
This PR creates a new namespace, torch.fft (torch::fft) and puts a single function, fft, in it. This function is a simplified version of NumPy's [numpy.fft.fft](https://numpy.org/doc/1.18/reference/generated/numpy.fft.fft.html?highlight=fft#numpy.fft.fft) that accepts no optional arguments. It is intended to demonstrate how to add and document functions in the namespace, and is not intended to deprecate the existing torch.fft function.

Adding this namespace was complicated by the existence of the torch.fft function in Python. Creating a torch.fft Python module makes this name ambiguous: does it refer to a function or module? If the JIT didn't exist, a solution to this problem would have been to make torch.fft refer to a callable class that mimicked both the function and module. The JIT, however, cannot understand this pattern. As a workaround it's required to explicitly `import torch.fft` to access the torch.fft.fft function in Python:

```
import torch.fft

t = torch.randn(128, dtype=torch.cdouble)
torch.fft.fft(t)
```

See https://github.com/pytorch/pytorch/issues/42175 for future work. Another possible future PR is to get the JIT to understand torch.fft as a callable class so it need not be imported explicitly to be used.
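
The "callable class that mimicked both the function and module" idea mentioned above can be sketched in plain Python. This is only an illustration of the pattern, not code from the PR: `CallableModule`, `toy_fft_namespace` and `_toy_fft` are hypothetical names, and the placeholder transform just copies its input.

```python
import types


class CallableModule(types.ModuleType):
    """A module object that can also be called like a function."""

    def __call__(self, *args, **kwargs):
        # Calling the module itself delegates to its `fft` attribute.
        return self.fft(*args, **kwargs)


def _toy_fft(values):
    # Placeholder standing in for a real transform; it only copies the input.
    return list(values)


# Hypothetical namespace object that behaves as both a module and a function.
fft = CallableModule("toy_fft_namespace")
fft.fft = _toy_fft

# Both spellings reach the same function.
same_result = fft([1, 2, 3]) == fft.fft([1, 2, 3])
```

As the message notes, the JIT cannot understand this pattern, which is why the explicit `import torch.fft` workaround was chosen instead.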

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41911

Reviewed By: glaringlee

Differential Revision: D22941894

Pulled By: mruberry

fbshipit-source-id: c8e0b44cbe90d21e998ca3832cf3a533f28dbe8d
2020-08-06 00:20:50 -07:00
Kurt Mohler
df7c059428 Throw error if torch.set_deterministic(True) is called with nondeterministic CuBLAS config (#41377)
Summary:
For CUDA >= 10.2, the `CUBLAS_WORKSPACE_CONFIG` environment variable must be set to either `:4096:8` or `:16:8` to ensure deterministic CUDA stream usage. This PR adds some logic inside `torch.set_deterministic()` to raise an error if this environment variable is not set properly and CUDA >= 10.2.
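
A minimal sketch of the kind of check described above, written in plain Python. The helper names, the `(major, minor)` version tuple and the error text are assumptions made for illustration; they are not the PR's actual implementation.

```python
import os

# Values named in the commit message as giving deterministic behavior.
ALLOWED_CONFIGS = (":4096:8", ":16:8")


def cublas_config_is_deterministic(cuda_version, environ=None):
    """Return True if the cuBLAS workspace setting permits deterministic mode.

    `cuda_version` is a (major, minor) tuple. This helper is hypothetical
    and is not part of PyTorch.
    """
    environ = os.environ if environ is None else environ
    if cuda_version < (10, 2):
        # The requirement only applies to CUDA >= 10.2.
        return True
    return environ.get("CUBLAS_WORKSPACE_CONFIG") in ALLOWED_CONFIGS


def require_deterministic_cublas(cuda_version, environ=None):
    """Raise RuntimeError when the environment variable is not set properly."""
    if not cublas_config_is_deterministic(cuda_version, environ):
        raise RuntimeError(
            "CUBLAS_WORKSPACE_CONFIG must be set to :4096:8 or :16:8 "
            "for deterministic behavior with CUDA >= 10.2"
        )
```

Passing the environment as a mapping keeps the check easy to exercise without changing the real process environment.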

Issue https://github.com/pytorch/pytorch/issues/15359

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41377

Reviewed By: malfet

Differential Revision: D22758459

Pulled By: ezyang

fbshipit-source-id: 4b96f1e9abf85d94ba79140fd927bbd0c05c4522
2020-08-05 12:42:24 -07:00
Yujun Zhao
0444bac940 Add test to cross function
Summary: The function `cross_kernel_scalar` is not covered in `Aten/native/cpu/CrossKernel.cpp`; add tests to cover it.

Test Plan:
1. Test locally to check new lines are covered
2. CI

https://pxl.cl/1fZjG

Reviewed By: malfet

Differential Revision: D22834122

fbshipit-source-id: 0d50f3a3e6aee52cb6fdee2b9f5883f542c7b6e2
2020-07-29 22:48:52 -07:00
Yujun Zhao
9ea7476d9c Add test to lerp function (#42266)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42266

The functions `lerp_kernel_scalar` and `lerp_kernel_tensor` are not covered in `Aten/native/cpu/LerpKernel.cpp`; add tests to cover them.

Test Plan:
1. Test locally to check new lines are covered
2. CI

https://pxl.cl/1fXPd

Reviewed By: malfet

Differential Revision: D22832164

fbshipit-source-id: b1eaabbf8bfa08b4dedc1a468abfdfb619a50e3c
2020-07-29 22:47:37 -07:00
lixinyu
5246bc4e87 register parameters correctly in c++ MultiheadAttention (#42037)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42037

This is to fix #41951

Test Plan: Imported from OSS

Reviewed By: yf225

Differential Revision: D22764717

Pulled By: glaringlee

fbshipit-source-id: e6da0aeb05a2356f52446e6d5fad391f2cd1cf6f
2020-07-27 13:58:11 -07:00
Heitor Schueroff de Souza
cf811d2fb3 retain undefined tensors in backward pass (#41490)
Summary:
Leave undefined tensors / None returned from custom backward functions as undefined/None instead of creating a tensor full of zeros. This change improves performance in some cases.

**This is BC-Breaking:** Custom backward functions that return None will now see it potentially being propagated all the way up to AccumulateGrad nodes. The potential impact is that the .grad field of leaf tensors, as well as the result of autograd.grad, may be undefined/None where it used to be a tensor full of zeros. Also, autograd.grad may raise an error; if so, consider using allow_unused=True ([see doc](https://pytorch.org/docs/stable/autograd.html?highlight=autograd%20grad#torch.autograd.grad)) if it applies to your case.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41490

Reviewed By: albanD

Differential Revision: D22578241

Pulled By: heitorschueroff

fbshipit-source-id: f4966f4cb520069294f8c5c1691eeea799cc0abe
2020-07-17 12:42:50 -07:00
albanD
45c5bac870 [WIP] Fix cpp grad accessor API (#40887)
Summary:
Update the API to access grad in cpp to avoid unexpected thread safety issues.
In particular, with the current API, a check like `t.grad().defined()` is not thread safe.

- This introduces `t.mutable_grad()` that should be used when getting a mutable version of the saved gradient. This function is **not** thread safe.
- The `Tensor& grad()` API is now removed. We could not do a deprecation cycle because most of our call sites use non-const Tensors, which select the non-const overload. Most calls would therefore hit the warning, which would be too verbose for users.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40887

Reviewed By: ezyang

Differential Revision: D22343932

Pulled By: albanD

fbshipit-source-id: d5eb909bb743bc20caaf2098196e18ca4110c5d2
2020-07-16 09:11:12 -07:00
yyn19951228
98df9781a7 Impl for ParameterList (#41259)
Summary:
This is a new PR for https://github.com/pytorch/pytorch/issues/40850, https://github.com/pytorch/pytorch/issues/40987 and https://github.com/pytorch/pytorch/issues/41206 (which I unintentionally closed), as I had some issues rebasing that one. Very sorry about that. I have also fixed the tests that failed in that PR.

This diff contains the implementation of C++ API for ParameterList from https://github.com/pytorch/pytorch/issues/25883.
Refer to the Python API: bc9e8af218/torch/nn/modules/container.py (L376)
I am not sure about some naming differences between the C++ API and the Python API; for example, should `append` be called `push_back`?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41259

Test Plan: Add unit tests in this diff

Differential Revision: D22495780

Pulled By: glaringlee

fbshipit-source-id: 79ea3592db640f35477d445ecdaeafbdad814bec
2020-07-12 20:50:31 -07:00
Sebastian Messmer
9daba76ba1 Change to.dtype_layout to c10-full (#41169)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41169

-
ghstack-source-id: 107537240

Test Plan: waitforsandcastle

Differential Revision: D22289257

fbshipit-source-id: ed3cc06327951fa886eb3b8f1c8bcc014ae2bc41
2020-07-10 16:04:34 -07:00
yyn19951228
4121d34036 Python/C++ API Parity: Add impl and tests for ParameterDict (#40654)
Summary:
This diff contains the implementation of the C++ API for ParameterDict from https://github.com/pytorch/pytorch/issues/25883; refer to https://github.com/pytorch/pytorch/issues/36904 and https://github.com/pytorch/pytorch/issues/28652.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40654

Test Plan: Add unit test in this diff

Differential Revision: D22273265

Pulled By: glaringlee

fbshipit-source-id: 9134a92c95eacdd53d5b24470d5f7edbeb40a488
2020-06-29 08:50:44 -07:00
Peter Bell
3dcc329746 Use tree-based sum for floats to avoid numerical instability (#39516)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/38716, fixes https://github.com/pytorch/pytorch/issues/37234

This algorithm does the summation along a single axis with multiple "levels" of accumulator, each of which is designed to hold the sum of an order of magnitude more values than the previous.

e.g. if there are 2^16 elements, the first level will hold the sum of 2^4 elements, and so on in increasing powers of 2: 2^4, 2^8, 2^12 and finally 2^16.

This limits the differences in magnitude of the partial results being added together, and so we don't lose accuracy as the axis length increases.
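
The multi-level accumulation described above can be sketched in plain Python. This is a scalar illustration under stated assumptions, not the kernel from the PR: the function name `cascade_sum` and its parameters are hypothetical, and the defaults (4 levels, 16 additions per level) mirror the 2^4, 2^8, 2^12, 2^16 example.

```python
import math


def cascade_sum(values, num_levels=4, level_size=16):
    """Sum `values` using several accumulator levels.

    Level 0 absorbs raw inputs. After `level_size` additions a level is
    folded into the next one and reset, so each level holds the sum of
    `level_size` times more inputs than the level below it.
    """
    acc = [0.0] * num_levels
    counts = [0] * num_levels
    for v in values:
        acc[0] += v
        counts[0] += 1
        level = 0
        # When a level is full, fold it into the next level and reset it.
        # The top level is never flushed.
        while level < num_levels - 1 and counts[level] == level_size:
            acc[level + 1] += acc[level]
            counts[level + 1] += 1
            acc[level] = 0.0
            counts[level] = 0
            level += 1
    # Combine the leftover partial sums, lowest level first.
    total = 0.0
    for partial in acc:
        total += partial
    return total


data = [0.1] * (2 ** 16)
# Compare against math.fsum, which returns a correctly rounded sum.
print(abs(cascade_sum(data) - math.fsum(data)))
print(abs(sum(data) - math.fsum(data)))
```

Because each level only ever adds partial sums of similar size, a small input is not repeatedly added to one very large running total.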

WIP to write a vectorized version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39516

Reviewed By: ezyang

Differential Revision: D22106251

Pulled By: ngimel

fbshipit-source-id: b56de4773292439dbda62b91f44ff37715850ae9
2020-06-24 17:06:38 -07:00
Peter Bell
16f276cef9 Add C++-only int dim overloads to std-related operations (#40451)
Summary:
Fixes gh-40287

The `int -> bool` conversion takes higher precedence than `int -> IntArrayRef`. So, calling `std(0)` in C++ would select the `std(unbiased=False)` overload instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40451

Differential Revision: D22217926

Pulled By: ezyang

fbshipit-source-id: 7520792fab5ab6665bddd03b6f57444c6c729af4
2020-06-24 16:56:55 -07:00
Mike Ruberry
cb26661fe4 Throws runtime error when torch.full would infer a float dtype from a bool or integral fill value (#40364)
Summary:
BC-breaking NOTE:

In PyTorch 1.6 bool and integral fill values given to torch.full must set the dtype or out keyword arguments. In prior versions of PyTorch these fill values would return float tensors by default, but in PyTorch 1.7 they will return a bool or long tensor, respectively. The documentation for torch.full has been updated to reflect this.
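
The PyTorch 1.6 rule stated above can be modelled with a small pure-Python function. This is a toy model for illustration only: `infer_full_dtype`, the string dtype names and the error text are hypothetical, it does not model the `out` argument, and it only covers bool, int and float fill values.

```python
# Placeholder for whatever the default floating-point dtype is.
DEFAULT_FLOAT = "default_float"


def infer_full_dtype(fill_value, dtype=None):
    """Model the PyTorch 1.6 dtype rule for torch.full (hypothetical helper)."""
    if dtype is not None:
        # An explicit dtype always wins.
        return dtype
    if isinstance(fill_value, (bool, int)):
        # Previously these values silently produced a float tensor;
        # in 1.6 the caller must state the dtype explicitly.
        raise RuntimeError(
            "bool or integral fill values require an explicit dtype"
        )
    if isinstance(fill_value, float):
        return DEFAULT_FLOAT
    raise TypeError("this sketch only models bool, int and float fill values")
```

Under this model, a call with a fill value of `7` and no dtype raises, while the same call with an explicit long dtype succeeds.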

PR NOTE:

This PR causes torch.full to throw a runtime error when it would have inferred a float dtype by being given a boolean or integer value. A versioned symbol for torch.full is added to preserve the behavior of already serialized Torchscript programs. Existing tests for this behavior being deprecated have been updated to reflect it now being unsupported, and a couple new tests have been added to validate the versioned symbol behavior. The documentation of torch.full has also been updated to reflect this change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40364

Differential Revision: D22176640

Pulled By: mruberry

fbshipit-source-id: b20158ebbcb4f6bf269d05a688bcf4f6c853a965
2020-06-23 23:27:22 -07:00