Commit Graph

108 Commits

Author SHA1 Message Date
Thomas J. Fan
d3bcba5f85 ENH Adds label_smoothing to cross entropy loss (#63122)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/7455

Partially resolves pytorch/vision#4281
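
A minimal usage sketch of the new keyword (shapes are illustrative):
```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)          # (batch, classes)
target = torch.randint(0, 10, (4,))  # integer class indices

# label_smoothing in [0.0, 1.0]; 0.0 recovers standard cross entropy
loss = F.cross_entropy(logits, target, label_smoothing=0.1)
```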

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63122

Reviewed By: iramazanli

Differential Revision: D30586076

Pulled By: jbschlosser

fbshipit-source-id: 06afc3aa1f8b9edb07fe9ed68c58968ad1926924
2021-08-29 23:33:04 -07:00
Xiang Gao
227cb268bc [Reland] Embedding thrust->cub migration (#63806)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/63427

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63806

Reviewed By: bdhirsh

Differential Revision: D30498255

Pulled By: ngimel

fbshipit-source-id: 78b7085a92a168cf0163f53dcb712bac922f5235
2021-08-24 09:30:32 -07:00
Thomas J. Fan
2ca2761f3c ENH Adds no_batch_dim for NLLLoss (#62651)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585
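
A sketch of the unbatched call this enables (assuming a build that includes this change):
```python
import torch
import torch.nn as nn

log_probs = torch.randn(10).log_softmax(dim=0)  # unbatched input of shape (C,)
target = torch.tensor(3)                        # scalar class index
loss = nn.NLLLoss()(log_probs, target)          # scalar loss, no batch dim needed
```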

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62651

Reviewed By: VitalyFedyunin

Differential Revision: D30303340

Pulled By: jbschlosser

fbshipit-source-id: 7ab478cf63bf6cd1f850cad5fd101e74a2cfe3f5
2021-08-24 08:27:27 -07:00
Thomas J. Fan
9914fb6615 ENH Adds no_batch_dim tests/docs for LPPool1d and Identity (#62190)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/60585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62190

Reviewed By: ejguan

Differential Revision: D29942385

Pulled By: jbschlosser

fbshipit-source-id: 00df6f6f01ad039631bb8679f8de94863aac7650
2021-08-24 06:59:41 -07:00
Thomas J. Fan
c5f3ab6982 ENH Adds no_batch_dim to FractionalMaxPool2d (#62490)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62490

Reviewed By: bdhirsh

Differential Revision: D30287143

Pulled By: jbschlosser

fbshipit-source-id: 1b9dd932157f571adf3aa2c98c3c6b56ece8fa6e
2021-08-13 08:48:40 -07:00
Thomas J. Fan
59d09b148c BUG Fixes bug in no_batch_dim tests (#62726)
Summary:
The way that Python captures variables for lambdas meant that only the last `input_fn`, etc. was captured. This PR makes sure the current value of the local variable is captured by each lambda.

REF: https://docs.python.org/3/faq/programming.html#why-do-lambdas-defined-in-a-loop-with-different-values-all-return-the-same-result
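
An illustrative reduction of the pitfall and the standard default-argument fix (not the test code itself):
```python
fns = [lambda: i for i in range(3)]
print([f() for f in fns])  # [2, 2, 2] -- every lambda sees the final i

# The fix: bind the current loop value via a default argument
fns = [lambda i=i: i for i in range(3)]
print([f() for f in fns])  # [0, 1, 2]
```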

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62726

Reviewed By: zou3519

Differential Revision: D30159478

Pulled By: jbschlosser

fbshipit-source-id: cfef3d9776d2676b2f5bb6d39d569b8ca07b0fe5
2021-08-06 11:11:25 -07:00
Joel Schlosser
a42345adee Support for target with class probs in CrossEntropyLoss (#61044)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/11959

Alternative approach to creating a new `CrossEntropyLossWithSoftLabels` class. This PR simply adds support for "soft targets" AKA class probabilities to the existing `CrossEntropyLoss` and `NLLLoss` classes.

Implementation is dumb and simple right now, but future work can add higher performance kernels for this case.
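
A sketch of the new calling convention (probabilities in place of class indices):
```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)
target_probs = torch.softmax(torch.randn(4, 10), dim=1)  # rows sum to 1

# the target may now share the input's shape and hold class probabilities
loss = F.cross_entropy(logits, target_probs)
```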

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61044

Reviewed By: zou3519

Differential Revision: D29876894

Pulled By: jbschlosser

fbshipit-source-id: 75629abd432284e10d4640173bc1b9be3c52af00
2021-07-29 10:04:41 -07:00
Thomas J. Fan
80a662e773 ENH Updates docs and tests for classification modules that already support no batch dims (#61874)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61874

Reviewed By: heitorschueroff

Differential Revision: D29979977

Pulled By: jbschlosser

fbshipit-source-id: 82c19151aa7220e564337b05d7677d52981e0aa2
2021-07-29 09:14:52 -07:00
Joel Schlosser
35307b131d Callable activation function support for Transformer modules (Python) (#61355)
Summary:
Fixes Python part of https://github.com/pytorch/pytorch/issues/60747

Enhances the Python versions of `Transformer`, `TransformerEncoderLayer`, and `TransformerDecoderLayer` to support callables as their activation functions. The old way of specifying activation function still works as well.
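
A sketch of both spellings (sizes are illustrative):
```python
import torch.nn as nn
import torch.nn.functional as F

# the old string spec still works; a callable is now accepted too
layer_str = nn.TransformerEncoderLayer(d_model=32, nhead=4, activation="gelu")
layer_fn = nn.TransformerEncoderLayer(d_model=32, nhead=4, activation=F.gelu)
```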

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61355

Reviewed By: bdhirsh

Differential Revision: D29967302

Pulled By: jbschlosser

fbshipit-source-id: 8ee6f20083d49dcd3ab432a18e6ad64fe1e05705
2021-07-28 21:42:56 -07:00
Thomas J. Fan
7c588d5d00 ENH Adds no_batch_dim support for pad 2d and 3d (#62183)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62183

Reviewed By: ejguan

Differential Revision: D29942250

Pulled By: jbschlosser

fbshipit-source-id: d1df4ddcb90969332dc1a2a7937e66ecf46f0443
2021-07-28 11:10:44 -07:00
Thomas J. Fan
89ca638c18 ENH Adds no batch dim support for AdaptiveMaxPool*D (#61847)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61847

Reviewed By: suo

Differential Revision: D29883887

Pulled By: jbschlosser

fbshipit-source-id: de3fcf1cc3878b138ab766d2a50cc59c52ec5a60
2021-07-26 07:35:36 -07:00
Thomas J. Fan
f03e7170f0 ENH Updates docs and tests for regression modules that already support no-batch-dims (#61461)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585

This PR does not use `check_sum_reduction` because I wanted to test every reduction option.
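
For reference, the three reduction options being exercised (a sketch with `MSELoss` standing in for the regression modules):
```python
import torch
import torch.nn as nn

x, y = torch.randn(5), torch.randn(5)
for reduction in ("none", "mean", "sum"):
    print(reduction, nn.MSELoss(reduction=reduction)(x, y))
```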

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61461

Reviewed By: suo

Differential Revision: D29883744

Pulled By: jbschlosser

fbshipit-source-id: cdad0effb41f0484938caad0d4c9d6d83e2aec07
2021-07-23 16:40:17 -07:00
Thomas J. Fan
0309c5780d ENH Adds no batch dim support for AvgPool1d (#61860)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61860

Reviewed By: albanD

Differential Revision: D29826382

Pulled By: jbschlosser

fbshipit-source-id: 47e12073d866f0604310fc1ff270cde9907e516d
2021-07-22 12:46:48 -07:00
Thomas J. Fan
48af9de92f ENH Enables No-batch for *Pad1d Modules (#61060)
Summary:
Toward https://github.com/pytorch/pytorch/issues/60585

This PR adds a `single_batch_reference_fn` that uses the single-batch implementation to check the no-batch case.
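
A rough sketch of the idea behind such a reference function (the real helper's signature may differ):
```python
import torch
import torch.nn as nn

def single_batch_reference_fn(input, module):
    # Route a no-batch input through the batched path and strip the
    # artificial batch dimension from the result.
    return module(input.unsqueeze(0)).squeeze(0)

out = single_batch_reference_fn(torch.randn(4), nn.ConstantPad1d(1, 0.0))
print(out.shape)  # torch.Size([6])
```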

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61060

Reviewed By: mrshenli

Differential Revision: D29739823

Pulled By: jbschlosser

fbshipit-source-id: d90d88a3671177a647171801cc6ec7aa3df35482
2021-07-21 07:12:41 -07:00
soulitzer
1b0a7f3887 Always use fast gradcheck for LayerNorm 3d_no_affine_large_feature (#61848)
Summary:
Due to the introduction of a test from https://github.com/pytorch/pytorch/pull/59987/files, slow gradcheck has been failing intermittently (timing out/getting killed).
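
For context, fast mode is opted into via the `fast_mode` flag; a minimal sketch:
```python
import torch
from torch.autograd import gradcheck

x = torch.randn(3, dtype=torch.double, requires_grad=True)
# fast_mode probes random vjp/jvp pairs instead of building full Jacobians
assert gradcheck(torch.exp, (x,), fast_mode=True)
```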

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61848

Reviewed By: mrshenli

Differential Revision: D29765773

Pulled By: soulitzer

fbshipit-source-id: d78bee758cab76f26ba9f54925c42d4825db9449
2021-07-19 12:33:55 -07:00
Thomas J. Fan
24a6eb3fda ENH Adds tests and docs for 2d & 3d modules that already support no batch (#61262)
Summary:
Toward https://github.com/pytorch/pytorch/issues/60585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61262

Reviewed By: mrshenli

Differential Revision: D29660554

Pulled By: jbschlosser

fbshipit-source-id: d5e3dc7096fcf8621bce4a1063d521b84092e0ca
2021-07-19 11:12:28 -07:00
Thomas J. Fan
dc0d1612e1 ENH Updates docs and tests for activation modules for no-batch dims (#61300)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585

This PR updates docs and tests for activation modules that already support no-batch dims.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61300

Reviewed By: heitorschueroff

Differential Revision: D29660543

Pulled By: jbschlosser

fbshipit-source-id: 5edad45f7e9995aca6c3403469668e6e1cbb94b6
2021-07-16 14:42:18 -07:00
Thomas J. Fan
25a705610f ENH Adds support for no-batch dim in AdaptiveAvgPool1d (#61264)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61264

Reviewed By: iramazanli

Differential Revision: D29615292

Pulled By: jbschlosser

fbshipit-source-id: 826d1c87d67261a7211270e90e3a1022bbbe37bd
2021-07-12 10:24:37 -07:00
Thomas J. Fan
87dbdef65d MAINT Adds test and docs for Linear with no batch dims (#60992)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585

This PR updates docs for `Linear` and adds a non-batch test case to `common_nn.py`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60992

Reviewed By: VitalyFedyunin

Differential Revision: D29518451

Pulled By: jbschlosser

fbshipit-source-id: 6dd79c0f21ac5b6f693e3e1ba954379d2606d4e0
2021-07-01 14:49:24 -07:00
Xiaomeng Yang
963c983366 Improve numerical stability of LayerNorm (#59987)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59987

Similar to GroupNorm, improve the numerical stability of LayerNorm via the Welford algorithm and pairwise summation.
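
A scalar sketch of the single-pass Welford update (the actual kernel additionally combines partial results, e.g. via pairwise summation):
```python
def welford_mean_var(xs):
    # Single-pass mean/variance; avoids the cancellation in E[x^2] - E[x]^2
    mean, m2, n = 0.0, 0.0, 0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return mean, m2 / n  # population variance
```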

Test Plan: buck test mode/dev-nosan //caffe2/test:nn -- "LayerNorm"

Reviewed By: ngimel

Differential Revision: D29115235

fbshipit-source-id: 5183346c3c535f809ec7d98b8bdf6d8914bfe790
2021-06-25 02:22:42 -07:00
Natalia Gimelshein
ddec2e0ef4 tentative fix for adaptiveavgpool gradient computation (#60630)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/60524

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60630

Reviewed By: jbschlosser

Differential Revision: D29374257

Pulled By: ngimel

fbshipit-source-id: be05f0ceb53e6f0f0a59a83b710dafde469d4e8a
2021-06-24 19:02:32 -07:00
Alexander Grund
0ba4044b9d Increase some tolerances for tf32 for Conv3d tests (#60451)
Summary:
Allows those tests to pass on A100 GPUs, which support TF32.

Basically a follow-up to https://github.com/pytorch/pytorch/pull/52871, which also increased some precisions to 0.05.

For reference, these are the failures I see (the only ones in test_nn with 1.9.0):
```
FAIL: test_Conv3d_pad_same_cuda_tf32 (__main__.TestNN)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
    method(*args, **kwargs)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
    method(*args, **kwargs)
  File "test_nn.py", line 11296, in with_tf32_on
    test.test_cuda(self, **kwargs)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_nn.py", line 5103, in test_cuda
    test_case.assertEqualIgnoreType(cpu_d_i, gpu_d_i, atol=self.precision, rtol=0)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1254, in assertEqualIgnoreType
    return self.assertEqual(*args, exact_dtype=False, **kwargs)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1355, in assertEqual
    super().assertTrue(result, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0 and atol=0.005, found 161 element(s) (out of 288) whose difference(s) exceeded the margin of error (including 0 nan compariso
ns). The greatest difference was 0.032408137116391345 (-33.45570601919647 vs. -33.42329788208008), which occurred at index (2, 0, 0, 1, 0).

======================================================================
FAIL: test_Conv3d_pad_same_dilated_cuda_tf32 (__main__.TestNN)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
    method(*args, **kwargs)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
    method(*args, **kwargs)
  File "test_nn.py", line 11296, in with_tf32_on
    test.test_cuda(self, **kwargs)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_nn.py", line 5103, in test_cuda
    test_case.assertEqualIgnoreType(cpu_d_i, gpu_d_i, atol=self.precision, rtol=0)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1254, in assertEqualIgnoreType
    return self.assertEqual(*args, exact_dtype=False, **kwargs)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1355, in assertEqual
    super().assertTrue(result, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0 and atol=0.005, found 111 element(s) (out of 288) whose difference(s) exceeded the margin of error (including 0 nan compariso
ns). The greatest difference was 0.024654212557543076 (35.104286017977465 vs. 35.07963180541992), which occurred at index (3, 0, 0, 0, 2).

======================================================================
FAIL: test_Conv3d_pad_valid_cuda_tf32 (__main__.TestNN)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
    method(*args, **kwargs)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
    method(*args, **kwargs)
  File "test_nn.py", line 11296, in with_tf32_on
    test.test_cuda(self, **kwargs)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_nn.py", line 5103, in test_cuda
    test_case.assertEqualIgnoreType(cpu_d_i, gpu_d_i, atol=self.precision, rtol=0)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1254, in assertEqualIgnoreType
    return self.assertEqual(*args, exact_dtype=False, **kwargs)
  File "/tmp/easybuild-tmp/eb-ED4 (1f47a80e88)M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1355, in assertEqual
    super().assertTrue(result, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0 and atol=0.005, found 41 element(s) (out of 288) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.010903167642320355 (8.074376869119371 vs. 8.06347370147705), which occurred at index (0, 0, 1, 0, 0).

```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60451

Reviewed By: albanD

Differential Revision: D29353255

Pulled By: ngimel

fbshipit-source-id: 155a02242be5a11dcbd9dd40ab63f15c6757ae1b
2021-06-24 13:36:27 -07:00
Thomas J. Fan
c16f87949f ENH Adds nn.ReflectionPad3d (#59791)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/27655

This PR adds C++ and Python versions of ReflectionPad3d with structured kernels. The implementation uses lambdas extensively to share code between the forward and backward passes.
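
A minimal usage sketch of the new module:
```python
import torch
import torch.nn as nn

pad = nn.ReflectionPad3d(1)  # pad all six sides by 1, reflecting at the border
x = torch.randn(1, 1, 2, 2, 2)
print(pad(x).shape)          # torch.Size([1, 1, 4, 4, 4])
```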

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59791

Reviewed By: gchanan

Differential Revision: D29242015

Pulled By: jbschlosser

fbshipit-source-id: 18e692d3b49b74082be09f373fc95fb7891e1b56
2021-06-21 10:53:14 -07:00
Eddie Yan
3870e68644 TF32 threshold twiddling for tests (#60209)
Summary:
Following https://github.com/pytorch/pytorch/issues/59624 I observed some straggling failing tests on Ampere due to TF32 thresholds. This PR just twiddles some more thresholds to fix the (6) failing tests I saw on A100.

CC Flamefire ptrblck ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60209

Reviewed By: gchanan

Differential Revision: D29220508

Pulled By: ngimel

fbshipit-source-id: 7c83187a246e1b3a24b181334117c0ccf2baf311
2021-06-18 11:41:33 -07:00
Xiaomeng Yang
ff15d93b88 Improve numerical stability of GroupNorm (#54921)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54921

Improve numerical stability of GroupNorm

Test Plan: buck test mode/dev-nosan //caffe2/test:nn -- "GroupNorm"

Reviewed By: ngimel

Differential Revision: D27414438

fbshipit-source-id: 815517240ca5ea3e2beb77ced3bd862e9c83d445
2021-06-13 16:13:32 -07:00
Adnios
09a8f22bf9 Add mish activation function (#58648)
Summary:
See issue: https://github.com/pytorch/pytorch/issues/58375
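
The definition being added, as a sketch checked against the functional API this PR introduces:
```python
import torch
import torch.nn.functional as F

def mish(x):
    # Mish(x) = x * tanh(softplus(x))
    return x * torch.tanh(F.softplus(x))

x = torch.randn(4)
print(torch.allclose(mish(x), F.mish(x)))  # True on builds with this change
```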

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58648

Reviewed By: gchanan

Differential Revision: D28625390

Pulled By: jbschlosser

fbshipit-source-id: 23ea2eb7d5b3dc89c6809ff6581b90ee742149f4
2021-05-25 10:36:21 -07:00
Jeffrey Wan
2b54cec7e8 Clean up naming and comments (#56964)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56964

This PR does many things but does not update any logic:
 - Prefixes all function names that are not `gradcheck`, `gradgradcheck`, `get_numerical_jacobian`, and `get_analytical_jacobian` with underscore to indicate that they aren't part of the public API (https://github.com/pytorch/pytorch/issues/55714).
 - Improves naming to avoid referencing Jacobian rows or Jacobian cols when we really mean vjp and jvp, as suggested by zou3519
 - Tries to reduce comment line length so comments are more consistent and easier to read
 - Other misc improvements to documentation

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D28096571

Pulled By: soulitzer

fbshipit-source-id: d372b5f8ee080669e525a987402ded72810baa0c
2021-04-30 17:40:14 -07:00
Kurt Mohler
1f04494c0e Consolidate nondeterministic error tests (#55631)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/51498

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55631

Reviewed By: malfet

Differential Revision: D27909953

Pulled By: mruberry

fbshipit-source-id: 9115b2433f9c276555be55bd51b270a7a2846829
2021-04-22 23:37:01 -07:00
Horace He
3c4e1cd141 remove annoying warnings from common_nn.py (#55982)
Summary:
^^

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55982

Reviewed By: mruberry

Differential Revision: D27776380

Pulled By: Chillee

fbshipit-source-id: 22b3a8de73416821bed56b75b68dca1c33a21250
2021-04-15 16:03:00 -07:00
Kurt Mohler
3fe4718d16 Add padding_idx argument to EmbeddingBag (#49237)
Summary:
This PR adds a `padding_idx` parameter to `nn.EmbeddingBag` and `nn.functional.embedding_bag`. As with `nn.Embedding`'s `padding_idx` argument, if an embedding's index is equal to `padding_idx` it is ignored, so it is not included in the reduction.

This PR does not add support for `padding_idx` for quantized or ONNX `EmbeddingBag` for opset10/11 (opset9 is supported). In these cases, an error is thrown if `padding_idx` is provided.
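
A minimal sketch of the new argument:
```python
import torch
import torch.nn as nn

bag = nn.EmbeddingBag(num_embeddings=10, embedding_dim=3,
                      mode="sum", padding_idx=0)
inp = torch.tensor([[0, 2, 5, 0]])  # entries equal to padding_idx are skipped
out = bag(inp)                      # shape (1, 3); only indices 2 and 5 summed
```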

Fixes https://github.com/pytorch/pytorch/issues/3194

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49237

Reviewed By: walterddr, VitalyFedyunin

Differential Revision: D26948258

Pulled By: jbschlosser

fbshipit-source-id: 3ca672f7e768941f3261ab405fc7597c97ce3dfc
2021-04-14 09:38:01 -07:00
Jeffrey Wan
381b3d8f4b Refactor get numerical jacobian to calculate wrt all outputs at once (#54378)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54378

### For release notes
`torch.autograd.gradcheck.get_numerical_jacobian` (not part of the public api) is being deprecated.

In the future, user code relying on this function will break because, among other changes, `get_numerical_jacobian` now returns `List[Tuple[torch.Tensor]]` instead of `List[torch.Tensor]`.

(more details if necessary)
For a `fn` that takes in M inputs and N outputs, we now return a list of M N-tuples of jacobians where `output[i][j]` represents the numerical jacobian w.r.t. the ith input and the jth output. Previously `get_numerical_jacobian` returned a list of tensors where each tensor represents the jacobian w.r.t. each of the M inputs for a specific output. Finally, the function passed in as the parameter `fn` should expect to handle individual parameters, where previously `fn` was required to expect its parameters wrapped in a tuple.

 --- end --

This PR addresses the comment here https://github.com/pytorch/pytorch/pull/53857#discussion_r595429639, to reduce the run-time of old gradcheck's `get_numerical_jacobian` by a factor of num_outputs. However, because very few ops actually return multiple outputs, there is not too much real speed up here.

The main benefit of doing this change as part of the refactor is that it helps us isolate the possible bugs that are specific to switching `get_numerical_jacobian` to run in a per-output way vs. all outputs at once. Much of the logic implemented here will be the same for the fast gradcheck case, so knowing for certain that everything should pass after this stage will make the next step much simpler.

The `get_numerical_jacobian` API is also used in common_nn, so we update the call site there as well.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D27728720

Pulled By: soulitzer

fbshipit-source-id: ee0f90b4f26ddc5fdbe949c4965eaa91c9ed0bb8
2021-04-13 10:06:20 -07:00
Yu Guo
846c8d94c7 mark embedding backward non-deterministic for max mode rather than all reducing modes (#55574)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55574

nn.EmbeddingBag backward is non-deterministic only when the reduction mode is max and running on GPU; the mean and sum reduction modes are deterministic.
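
A sketch of the resulting behavior (requires a CUDA device; the error on max mode is the point of this change, while mean/sum run fine under the same flag):
```python
import torch
import torch.nn as nn

torch.use_deterministic_algorithms(True)
bag = nn.EmbeddingBag(10, 3, mode="max").cuda()
out = bag(torch.tensor([[1, 2, 4]], device="cuda"))
out.sum().backward()  # raises: max-mode backward has no deterministic kernel
```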

Test Plan: NA

Reviewed By: ngimel

Differential Revision: D27633832

fbshipit-source-id: 50786ed8522f1aae27442f5f244a65eab8000b06
2021-04-09 16:19:01 -07:00
Bel H
645119eaef Lowering NLLLoss/CrossEntropyLoss to ATen code (#53789)
Summary:
* Lowering NLLLoss/CrossEntropyLoss to ATen dispatch
* This allows the MLC device to override these ops
* Reduce code duplication between the Python and C++ APIs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53789

Reviewed By: ailzhang

Differential Revision: D27345793

Pulled By: albanD

fbshipit-source-id: 99c0d617ed5e7ee8f27f7a495a25ab4158d9aad6
2021-03-26 07:31:08 -07:00
Xiang Gao
9f336bdf10 Fixes new tf32 failures in test_nn.py (#52871)
Summary:
Also modify the `tf32_on_and_off` decorator to make it support function without `device` argument.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52871

Reviewed By: ngimel

Differential Revision: D27286674

Pulled By: mruberry

fbshipit-source-id: 14f6d558271bd6a1d0bc40691c170d47e81de1ff
2021-03-24 21:53:33 -07:00
Yukio Siraichi
27048c1dfa Remove legacy constructor calls from _torch_ folder. (#53889)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53146
Related to https://github.com/pytorch/pytorch/issues/47112

As mentioned in https://github.com/pytorch/pytorch/issues/47112, the plan is to:

1. Verify that all `torch.Tensor()` scenarios are covered by other functions
2. Scrub internal `torch.Tensor()` uses
3. Update the docs and throw `TORCH_WARN_ONCE` if someone uses `torch.Tensor()`

In this PR, I replaced all occurrences of `torch.Tensor` present in the _torch_ folder.
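
For reference, the explicit constructors that cover the legacy `torch.Tensor()` use cases:
```python
import torch

a = torch.tensor([1.0, 2.0])  # construct from existing data
b = torch.empty(2, 3)         # uninitialized storage of a given shape
c = torch.zeros(2, 3)         # given shape with an explicit fill value
```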

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53889

Reviewed By: walterddr, zou3519

Differential Revision: D27190743

Pulled By: jbschlosser

fbshipit-source-id: 7ecc201d57935b8dbb98ae3718b60d95cb55a010
2021-03-19 15:20:19 -07:00
Peter Bell
04e0cbf5a9 Add padding='same' mode to conv{1,2,3}d (#45667)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45667

First part of #3867 (Pooling operators still to do)

This adds a `padding='same'` mode to the interface of `conv{n}d`and `nn.Conv{n}d`. This should match the behaviour of `tensorflow`. I couldn't find it explicitly documented but through experimentation I found `tensorflow` returns the shape `ceil(len/stride)` and always adds any extra asymmetric padding onto the right side of the input.

Since the `native_functions.yaml` schema doesn't seem to support strings or enums, I've moved the function interface into python and it now dispatches between the numerically padded `conv{n}d` and the `_conv{n}d_same` variant. Underscores because I couldn't see any way to avoid exporting a function into the `torch` namespace.

A note on asymmetric padding: the total padding required can be odd if the kernel length is even and the dilation is odd. mkldnn has native support for asymmetric padding, so there is no overhead there, but for other backends I resort to padding the input tensor by 1 on the right hand side to make the remaining padding symmetrical. In these cases, I use `TORCH_WARN_ONCE` to notify the user of the performance implications.
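
A minimal sketch of the new interface:
```python
import torch
import torch.nn as nn

conv = nn.Conv1d(1, 1, kernel_size=3, padding="same")  # stride must be 1
x = torch.randn(1, 1, 10)
print(conv(x).shape)  # torch.Size([1, 1, 10]) -- length preserved
```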

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D27170744

Pulled By: jbschlosser

fbshipit-source-id: b3d8a0380e0787ae781f2e5d8ee365a7bfd49f22
2021-03-18 16:22:03 -07:00
Joel Schlosser
e86476f736 Huber loss (#50553)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/48595.

## Background

This PR implements HuberLoss, which differs from SmoothL1Loss by a factor of beta. The current implementation does not share logic between the two. Feedback is welcome for the optimal way to minimize code duplication while remaining performant.

I've done some early [benchmarking](https://pytorch.org/tutorials/recipes/recipes/benchmark.html#collecting-instruction-counts-with-callgrind) with Huber calling in to the Smooth L1 kernel and scaling afterwards; for the simple test case I used, instruction counts are as follows:
```
Huber loss calls dedicated Huber kernel: 2,795,300
Huber loss calls Smooth L1 kernel and scales afterwards: 4,523,612
```
With these numbers, instruction counts are ~62% higher when using the pre-existing Smooth L1 kernel.
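
For reference, a sketch of the definition the dedicated kernel computes (delta-parameterized Huber loss with mean reduction):
```python
import torch

def huber(pred, target, delta=1.0):
    # quadratic within |err| <= delta, linear (slope delta) outside
    err = (pred - target).abs()
    quad = 0.5 * err.pow(2)
    lin = delta * (err - 0.5 * delta)
    return torch.where(err <= delta, quad, lin).mean()
```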

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50553

Test Plan:
```
python test/test_nn.py TestNN.test_HuberLoss
python test/test_nn.py TestNN.test_HuberLoss_delta
python test/test_nn.py TestNN.test_huber_loss_invalid_delta
python test/test_nn.py TestNNDeviceTypeCPU.test_smooth_l1_loss_vs_huber_loss_cpu
python test/test_nn.py TestNNDeviceTypeCUDA.test_smooth_l1_loss_vs_huber_loss_cuda
python test/test_nn.py TestNNDeviceTypeCPU.test_invalid_reduction_strings_cpu
python test/test_nn.py TestNNDeviceTypeCUDA.test_invalid_reduction_strings_cuda
python test/test_nn.py TestNN.test_loss_equal_input_target_shape
python test/test_nn.py TestNN.test_pointwise_loss_broadcast
python test/test_overrides.py
python test/test_jit.py TestJitGeneratedFunctional.test_nn_huber_loss
python test/test_type_hints.py
python test/test_cpp_api_parity.py
build/bin/test_api
```

## Documentation
<img width="677" alt="Screen Shot 2021-01-14 at 4 25 08 PM" src="https://user-images.githubusercontent.com/75754324/104651224-5a445980-5685-11eb-884b-14ea517958c2.png">
<img width="677" alt="Screen Shot 2021-01-14 at 4 24 35 PM" src="https://user-images.githubusercontent.com/75754324/104651190-4e589780-5685-11eb-974d-8c63a89c050e.png">
<img width="661" alt="Screen Shot 2021-01-14 at 4 24 45 PM" src="https://user-images.githubusercontent.com/75754324/104651198-50225b00-5685-11eb-958e-136b36f6f8a8.png">
<img width="869" alt="Screen Shot 2021-01-14 at 4 25 27 PM" src="https://user-images.githubusercontent.com/75754324/104651208-53b5e200-5685-11eb-9fe4-5ff433aa13c5.png">
<img width="862" alt="Screen Shot 2021-01-14 at 4 25 48 PM" src="https://user-images.githubusercontent.com/75754324/104651209-53b5e200-5685-11eb-8051-b0cfddcb07d3.png">

Reviewed By: H-Huang

Differential Revision: D26734071

Pulled By: jbschlosser

fbshipit-source-id: c98c1b5f32a16f7a2a4e04bdce678080eceed5d5
2021-03-02 17:30:45 -08:00
Brian Hirsh
57637e0ab4 port upsample_nearest3d and upsample_trilinear3d to structured (#52065)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52065

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D26373027

Pulled By: bdhirsh

fbshipit-source-id: 76b283ea8142732ffc8f7b200a8494349739e326
2021-02-22 10:38:52 -08:00
Brian Hirsh
d659477ae0 port upsample_bilinear2d and upsample_bicubic2d to structured (#52012)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52012

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D26356329

Pulled By: bdhirsh

fbshipit-source-id: 8f974224799493e3172fe5dff3fbd43af8c09722
2021-02-22 10:38:48 -08:00
Brian Hirsh
f3ea5ca672 port upsample_linear1d to structured (#51917)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51917

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D26327750

Pulled By: bdhirsh

fbshipit-source-id: 443ad278010ce655eb5f08fa6889c45ccb328268
2021-02-22 10:38:43 -08:00
Arindam Roy
da920fa141 Enable rocm tests in common nn (#51227)
Summary:
Fixes #{issue number}
Resubmitting a new PR as the older one got reverted due to problems in test_optim.py.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51227

Reviewed By: ezyang

Differential Revision: D26142505

Pulled By: ailzhang

fbshipit-source-id: a2ab5d85630aac2d2ce17652ba19c11ea668a6a9
2021-01-29 12:54:04 -08:00
Jeffrey Wan
c0966914bc Internal gradcheck wrapper in testing._internal that sets certain flags to True (#51133)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/49409

There are many call sites where gradcheck/gradgradcheck is now implicitly invoked with `check_batched_grad` as True, though it was previously False. Cases fall into two basic categories:
1) the call site was previously using `torch.autograd.gradcheck` but is now changed to use the globally imported function instead
2) the call site was already using the globally imported function, but does not explicitly pass the `check_batched_grad` flag

Only in the _assertGradAndGradgradChecks cases, which are infrequent, I assumed that the author is aware that omitting the flag means not applying check_batched_grad=True (but maybe that is not the case?).

Overall this PR in its current state assumes that unless the author explicitly specified `check_batched_grad=False`, they were just probably not aware of this flag and did not mean to have this flag as False.
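
For reference, a minimal sketch of what the flag turns on:
```python
import torch
from torch.autograd import gradcheck

x = torch.randn(3, dtype=torch.double, requires_grad=True)
# additionally checks vmap-computed batched gradients against the loop version
assert gradcheck(torch.sin, (x,), check_batched_grad=True)
```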

So far exceptions to the above (as discovered by CI) include:
 - Mkldnn (opaque tensors do not have strides) https://app.circleci.com/pipelines/github/pytorch/pytorch/264416/workflows/e4d87886-6247-4305-8526-2696130aa9a4/jobs/10401882/tests
 - all cases in test_sparse (https://app.circleci.com/pipelines/github/pytorch/pytorch/264553/workflows/3c1cbe30-830d-4acd-b240-38d833dccd9b/jobs/10407103)
 - all cases in test_overrides (https://app.circleci.com/pipelines/github/pytorch/pytorch/264553/workflows/3c1cbe30-830d-4acd-b240-38d833dccd9b/jobs/10407236)
 - test_autograd (test_LSTM_grad_and_gradgrad) - (https://app.circleci.com/pipelines/github/pytorch/pytorch/264553/workflows/3c1cbe30-830d-4acd-b240-38d833dccd9b/jobs/10407235)
 - test_data_parallel (test_data_parallel_buffers_requiring_grad) - *SIGSEGV* (https://app.circleci.com/pipelines/github/pytorch/pytorch/264820/workflows/14d89503-040d-4e3d-9f7b-0bc04833589b/jobs/10422697)
 - test_nn (https://app.circleci.com/pipelines/github/pytorch/pytorch/264919/workflows/df79e3ed-8a31-4a8e-b584-858ee99686ff/jobs/10427315)

Possible TODO is to prevent new tests from invoking external gradcheck.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51133

Reviewed By: ezyang

Differential Revision: D26147919

Pulled By: soulitzer

fbshipit-source-id: dff883b50f337510a89f391ea2fd87de2d531432
2021-01-29 09:13:37 -08:00
mattip
345844d9d8 test, fix deepcopy of tensor with grad (#50663)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/3307

Previously, `self.grad` was not ~~cloned~~ deepcopied to the returned tensor in `deepcopy`. Added a test and an implementation.
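
A minimal sketch of the fixed behavior:
```python
import copy
import torch

t = torch.randn(3, requires_grad=True)
t.sum().backward()
t2 = copy.deepcopy(t)
print(t2.grad)  # previously None; now a deepcopy of t.grad
```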

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50663

Reviewed By: heitorschueroff

Differential Revision: D26074811

Pulled By: albanD

fbshipit-source-id: 536dad36415f1d03714b4ce57453f406ad802b8c
2021-01-26 16:19:53 -08:00
Richard Zou
83315965ab Turn on batched grad testing for CriterionTest (#50744)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50744

This PR adds a `check_batched_grad=True` option to CriterionTest and
turns it on by default for all CriterionTest-generated tests

Test Plan: - run tests

Reviewed By: ejguan

Differential Revision: D25997676

Pulled By: zou3519

fbshipit-source-id: cc730731e6fae2bddc01bc93800fd0e3de28b32d
2021-01-26 07:37:15 -08:00
Richard Zou
63838b9330 Turn on batched_grad testing for NewModuleTest (#50740)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50740

This PR adds a `check_batched_grad=True` option to
NewModuleTest-generated NN tests.

Test Plan: - run tests (`pytest test/test_nn.py -v -rf`)

Reviewed By: ejguan

Differential Revision: D25997679

Pulled By: zou3519

fbshipit-source-id: b75e73d7e86fd3af9bad6efed7127b36551587b3
2021-01-22 15:33:09 -08:00
Peter Bell
db079a9877 Padding: support complex dtypes (#50594)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50594

Fixes #50234
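
A minimal sketch of a call that previously errored (assuming a build with this change):
```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 4, dtype=torch.cfloat)
y = F.pad(x, (1, 1), mode="reflect")  # complex input now padded like real ones
```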

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D25987316

Pulled By: anjali411

fbshipit-source-id: c298b771fe52b267a86938e886ea402badecfe3e
2021-01-22 11:57:42 -08:00
Peter Bell
47f0bda3ef Improve complex support in common_nn test machinery (#50593)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50593

There is no equivalent of torch.FloatTensor or torch.cuda.FloatTensor for complex
types, so `get_gpu_type` and `get_cpu_type` are broken for complex tensors.

Also found a few places that explicitly cast inputs to floating point types,
which would drop the imaginary component before running the test.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D25954050

Pulled By: mruberry

fbshipit-source-id: 1fa8e5af233aa095c839d5e2f860564baaf92aef
2021-01-22 09:44:45 -08:00
Natalia Gimelshein
4d169258ef Revert D25976245: [pytorch][PR] Enable Skipped ROCM Tests in common_nn.py
Test Plan: revert-hammer

Differential Revision:
D25976245 (24a0272132)

Original commit changeset: 801032534f91

fbshipit-source-id: 561e6d761cb694451d5f87557b4f96f37d19dd90
2021-01-21 13:28:37 -08:00
Arindam Roy
24a0272132 Enable Skipped ROCM Tests in common_nn.py (#50753)
Summary:
Removed test_cuda=(not TEST_WITH_ROCM)
in common_nn.py to enable the skipped tests
for ROCM.

Signed-off-by: Arindam Roy <rarindam@gmail.com>

Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50753

Reviewed By: mrshenli

Differential Revision: D25976245

Pulled By: ngimel

fbshipit-source-id: 801032534f911d24d231bc9f0d3235a4506412c0
2021-01-21 09:48:47 -08:00
Jeffrey Wan
6e3e57095c Add complex support for torch.nn.L1Loss (#49912)
Summary:
Building on top of the work of anjali411 (https://github.com/pytorch/pytorch/issues/46640)

Things added in this PR:
1. Modify backward and double-backward formulas
2. Add complex support for `new module tests` and criterion tests (and add complex tests for L1)
3. Modify some existing tests to support complex
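
A quick sketch of what complex `L1Loss` enables (illustrative shapes):
```python
import torch
import torch.nn as nn

x = torch.randn(4, dtype=torch.cfloat, requires_grad=True)
y = torch.randn(4, dtype=torch.cfloat)
loss = nn.L1Loss()(x, y)  # real-valued mean of |x - y|
loss.backward()
```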

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49912

Reviewed By: zhangguanheng66

Differential Revision: D25853036

Pulled By: soulitzer

fbshipit-source-id: df619f1b71c450ab2818eb17804e0c55990aa8ad
2021-01-15 15:53:15 -08:00