Kurt Mohler
4cfd09d7bc
Reland: Add index value checking to MaxUnpool2d and MaxUnpool3d ( #78280 )
...
Relanding #70545
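A minimal sketch (not taken from the PR) of the behavior this adds: out-of-range `indices` passed to `MaxUnpool2d` are now expected to be rejected instead of reading out of bounds; the exact error type and message are assumptions here.
```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(2, stride=2)

x = torch.randn(1, 1, 4, 4)
out, indices = pool(x)
unpool(out, indices)                    # valid indices: unpools fine

bad = torch.full_like(indices, 10_000)  # indices outside the output size
try:
    unpool(out, bad)                    # expected to raise after this change
except RuntimeError as e:
    print(e)
```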
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78280
Approved by: https://github.com/jbschlosser
2022-06-03 20:09:07 +00:00
samdow
b7cb4eae6b
Fix embedding jvp support by making embedding_renorm ignore forward mode AD ( #78560 )
...
On functorch, we started seeing [embedding forward mode fail](https://github.com/pytorch/functorch/pull/816). Looking into it, we found that [embedding recently got forward mode support enabled](369d9f4137), and because [max_norm doesn't work with gradcheck](https://github.com/pytorch/pytorch/blob/master/torch/testing/_internal/common_methods_invocations.py#L8877-L8881), the forward-mode-with-max_norm combination is never exercised there.
What was happening is that `embedding_renorm` used `torch.no_grad()`, which only disables backward mode AD, so functorch's jvp tests were still running forward mode AD through the `embedding_renorm` call. This change makes the `embedding_renorm` call ignore forward mode AD as well.
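A rough repro sketch of the failing pattern using the public forward-mode AD API (shapes and values here are illustrative, not taken from the functorch tests):
```python
import torch
import torch.nn.functional as F
import torch.autograd.forward_ad as fwAD

weight = torch.randn(10, 3)
tangent = torch.randn(10, 3)
idx = torch.tensor([1, 2, 4, 5])

with fwAD.dual_level():
    dual_weight = fwAD.make_dual(weight, tangent)
    # max_norm triggers the in-place embedding_renorm_ on the weight;
    # torch.no_grad() there only disables backward-mode AD, so forward-mode
    # tangents still flowed through it before this fix.
    out = F.embedding(idx, dual_weight, max_norm=1.0)
    primal, jvp = fwAD.unpack_dual(out)
```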
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78560
Approved by: https://github.com/soulitzer , https://github.com/albanD
2022-06-03 19:14:51 +00:00
Eddie Yan
14b0e9e75f
[cuDNN] Don't enforce bitwise exact results in test_conv_transposed_large_cuda ( #78147 )
...
`test_conv_transposed_large` expects bitwise perfect results in fp16 on CUDA, but this behavior isn't guaranteed by cuDNN (e.g., in the case of FFT algos).
This PR just changes the tolerance on the test to account for these cases.
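Not the test's actual diff, just a sketch of the kind of check involved: comparing with an explicit tolerance instead of requiring bitwise equality (tensor shapes and tolerance values are placeholders):
```python
import torch

a = torch.randn(64, 64).half()
b = (a.float() + 1e-3).half()   # simulate a small, algorithm-dependent difference

# instead of requiring bitwise-identical fp16 results, compare with a tolerance
torch.testing.assert_close(b, a, rtol=0.0, atol=5e-2)
```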
CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78147
Approved by: https://github.com/ngimel
2022-06-03 19:03:24 +00:00
Eddie Yan
b740a99b9e
[cuDNN][TF32] Threshold adjustments for TF32 on >=sm80 ( #78437 )
...
CC @ptrblck @mcarilli
The change to the transformer multilayer test could potentially be swapped for an rtol change instead (see also: #75612 ).
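For context (not part of the PR), these are the flags controlling whether TF32 kernels are used on Ampere-class GPUs, which is why the >=sm80 thresholds differ:
```python
import torch

# TF32 trades a few mantissa bits for speed on sm80+ hardware, so tests that
# run with it enabled need looser numerical thresholds
torch.backends.cuda.matmul.allow_tf32 = True   # matmuls
torch.backends.cudnn.allow_tf32 = True         # cuDNN convolutions
```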
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78437
Approved by: https://github.com/ngimel
2022-06-03 01:02:56 +00:00
PyTorch MergeBot
d578197747
Revert "Fix embedding jvp support by making embedding_renorm ignore forward mode AD ( #78560 )"
...
This reverts commit ce7c7bb2a9 .
Reverted https://github.com/pytorch/pytorch/pull/78560 on behalf of https://github.com/malfet because it broke XLA (on CI and trunk), see ce7c7bb2a9
2022-06-02 17:40:34 +00:00
samdow
ce7c7bb2a9
Fix embedding jvp support by making embedding_renorm ignore forward mode AD ( #78560 )
...
On functorch, we started seeing [embedding forward mode fail](https://github.com/pytorch/functorch/pull/816). Looking into it, we found that [embedding recently got forward mode support enabled](369d9f4137), and because [max_norm doesn't work with gradcheck](https://github.com/pytorch/pytorch/blob/master/torch/testing/_internal/common_methods_invocations.py#L8877-L8881), the forward-mode-with-max_norm combination is never exercised there.
What was happening is that `embedding_renorm` used `torch.no_grad()`, which only disables backward mode AD, so functorch's jvp tests were still running forward mode AD through the `embedding_renorm` call. This change makes the `embedding_renorm` call ignore forward mode AD as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78560
Approved by: https://github.com/soulitzer , https://github.com/albanD
2022-06-02 13:40:21 +00:00
Edward Z. Yang
c20969c40c
Fix ParameterList printing meta tensor
...
Fixes https://github.com/pytorch/pytorch/issues/78250
There are actually two bugs. First, the crash is caused
by TensorOptions::backend incorrectly reporting noexcept when
it can in fact fail. Second, ParameterList is using torch.tensortype
for no good reason; we can just print the dtype instead.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78529
Approved by: https://github.com/albanD
2022-06-01 00:46:52 +00:00
mikeiovine
d6db5ea50d
Back out "add mixed data type mode for LayerNorm forward path"
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78298
Also back out "improve LayerNorm bfloat16 performance on CPU".
These layer norm changes seem fine, but they are causing `LayerNorm` to not use AVX2 instructions, which is causing performance on internal models to degrade. More investigation is needed to find the true root cause, but we should unland to mitigate the issue ASAP.
I left `mixed_data_type.h` around since there are some other files depending on it.
Differential Revision: [D36675352](https://our.internmc.facebook.com/intern/diff/D36675352/ )
Approved by: https://github.com/tenpercent
2022-05-26 02:54:13 +00:00
PyTorch MergeBot
c50089712c
Revert "Add index value checking to MaxUnpool2d and MaxUnpool3d ( #70545 )"
...
This reverts commit 53ef66bb59 .
Reverted https://github.com/pytorch/pytorch/pull/70545 on behalf of https://github.com/malfet as it broke the cuda-10.2 test on trunk, see 53ef66bb59
2022-05-23 23:58:43 +00:00
Kurt Mohler
53ef66bb59
Add index value checking to MaxUnpool2d and MaxUnpool3d ( #70545 )
...
Fixes #68727
cc @mruberry @jbschlosser @walterddr @kshitij12345 @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70545
Approved by: https://github.com/ngimel
2022-05-23 21:08:25 +00:00
yuguo68
c186250d95
raise error when groups is not positive in Conv modules
...
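An illustrative snippet of the check being added (the exact error type and message are assumptions, not taken from the PR):
```python
import torch.nn as nn

try:
    nn.Conv2d(in_channels=6, out_channels=6, kernel_size=3, groups=0)
except ValueError as e:
    # after this change, non-positive `groups` is rejected at construction time
    print(e)
```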
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77919
Approved by: https://github.com/jbschlosser
2022-05-23 20:35:00 +00:00
Jeff Daily
9aed30d3ad
[ROCm] support benchmark flag for MIOpen ( #77438 )
...
Fixes #68172 . Generally, this corrects multiple flaky convolution unit test behaviors seen on ROCm.
The MIOpen integration has been forcing benchmark=True even when `torch._C._set_cudnn_benchmark(False)` is called (typically via `torch.backends.cudnn.set_flags(enabled=True, benchmark=False)`). We now add support for MIOpen immediate mode so that solution selection can avoid benchmarking.
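For reference, the flag in question is toggled from Python like this (illustrative only; the MIOpen immediate-mode handling itself lives in the C++ integration):
```python
import torch

# globally disable benchmarking during conv algorithm selection;
# with this change, MIOpen respects this by using immediate mode
torch.backends.cudnn.benchmark = False

# or scoped to a block
with torch.backends.cudnn.flags(enabled=True, benchmark=False):
    pass  # run convolutions here without autotuning
```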
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77438
Approved by: https://github.com/ngimel , https://github.com/malfet
2022-05-23 17:10:24 +00:00
zrphercule
734a97a7c8
Revert "Revert "Switch to use nested tensor by-default in Transformer… ( #77924 )
...
…Encoder (#77217 )""
This reverts commit 0d6fa91d1b .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77924
Approved by: https://github.com/atalman
2022-05-20 11:44:03 +00:00
George Qi
f9db8b72ac
MHA forward pass bug fix
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77761
Approved by: https://github.com/jbschlosser
2022-05-19 01:21:24 +00:00
Joel Benjamin Schlosser
8881d7ac6c
Support no-batch-dim for CrossEntropyLoss with prob target
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77653
Approved by: https://github.com/albanD
2022-05-18 19:51:09 +00:00
Nikita Vedeneev
a760dc2687
binary_cross_entropy: double backward wrt target (#77416 )
...
As per title. An effort to make `binary_cross_entropy` all around differentiable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77416
Approved by: https://github.com/soulitzer
2022-05-18 10:29:27 +00:00
Rui Zhu
4e2f5507d0
Add support for TxT mask layout for masked_softmax in BetterTransformer ( #77607 )
...
Summary: Expand mask to BxHxDxD when mask is DxD layout
Test Plan: buck build mode/opt -c fbcode.platform=platform009 -c fbcode.enable_gpu_sections=true caffe2/test:nn && buck-out/opt/gen/caffe2/test/nn\#binary.par -r masked_softmax_DxD
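A rough sketch of the broadcast described in the summary (tensor names and sizes are illustrative):
```python
import torch

B, H, T = 8, 12, 64
mask = torch.rand(T, T) > 0.5   # a TxT (a.k.a. DxD) attention mask

# broadcast the 2-D mask across batch and heads so it matches the
# BxHxTxT layout the fused masked_softmax kernel works with
expanded = mask.unsqueeze(0).unsqueeze(0).expand(B, H, T, T)
```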
Differential Revision: D36428170
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77607
Approved by: https://github.com/cpuhrsch
2022-05-18 01:31:05 +00:00
PyTorch MergeBot
d8b80edade
Revert "Use weakref.proxy when saving module to internal dictionaries to not increase refcount ( #76435 )"
...
This reverts commit 1aa3cbb83b .
Reverted https://github.com/pytorch/pytorch/pull/76435 on behalf of https://github.com/jbschlosser
2022-05-17 17:51:26 +00:00
mingfeima
c003494754
add channels last support for PixelShuffle and PixelUnshuffle
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50573
Approved by: https://github.com/VitalyFedyunin
2022-05-17 17:33:49 +00:00
Edward Z. Yang
b5bc954a71
Fix optional dtype/layout/memory_format pycall; fix memory format
...
Double-header bug fix:
- As reported by jansel, dtypes are still showing up as integers
when the schema is an optional dtype. This is simple enough to
fix and I added a test for it. But while I was at it...
- I noticed that the THPMemoryFormat_new idiom with "unused" name
doesn't actually work: the repr of the returned memory format
object is wrong, and this shows up when we try to log the args/kwargs.
So I fixed memory format to do it properly along with everything
else.
Fixes https://github.com/pytorch/pytorch/issues/77135
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77543
Approved by: https://github.com/albanD , https://github.com/jansel
2022-05-16 16:46:08 +00:00
mingfeima
8c50414233
add BFloat16 support for BatchNorm on CPU
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77496
Approved by: https://github.com/frank-wei
2022-05-16 16:31:18 +00:00
mingfeima
6fa20bdfe8
add native kernel for weight_norm on CPU
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73845
Approved by: https://github.com/frank-wei
2022-05-16 06:36:24 +00:00
PyTorch MergeBot
93a969221d
Revert "add BFloat16 support for BatchNorm on CPU"
...
This reverts commit 7c8911ca7a .
Reverted https://github.com/pytorch/pytorch/pull/74410 on behalf of https://github.com/albanD
2022-05-14 14:28:58 +00:00
mingfeima
7c8911ca7a
add BFloat16 support for BatchNorm on CPU
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74410
Approved by: https://github.com/frank-wei
2022-05-14 07:49:00 +00:00
Rohan Varma
a275491c6f
[Reland] load_state_dict post hook ( #77392 )
...
Reland of https://github.com/pytorch/pytorch/pull/76823 with fixes to call `__setstate__` for softmax/softmin/logsoftmax as per discussion with @albanD and @jbschlosser. Original description:
Implements `register_load_state_dict_post_hook` API as discussed in https://github.com/pytorch/pytorch/issues/75287 .
Unittests cover:
- Ensuring hooks are called with the correct module
- Hook is called with `IncompatibleKeys` field
- If hook modifies this, load_state_dict returns the modified result
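A usage sketch based on the description above (the hook signature shown here is assumed from the PR text):
```python
import torch
import torch.nn as nn

def post_hook(module, incompatible_keys):
    # incompatible_keys carries missing_keys / unexpected_keys;
    # mutating it here changes what load_state_dict returns
    incompatible_keys.missing_keys.clear()

m = nn.Linear(4, 4)
m.register_load_state_dict_post_hook(post_hook)

result = m.load_state_dict({"weight": torch.randn(4, 4)}, strict=False)
print(result)   # "bias" was missing, but the hook cleared it from the report
```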
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77392
Approved by: https://github.com/jbschlosser
2022-05-14 06:06:23 +00:00
mingfeima
59b56ba785
improve group_norm channels last performance on CPU
...
add channels_last_3d memory format support
add BFloat16 support on CPU
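A quick sketch of hitting the code paths mentioned above (purely illustrative):
```python
import torch
import torch.nn as nn

gn = nn.GroupNorm(num_groups=4, num_channels=32)

# 4-D channels-last input on CPU
x = torch.randn(8, 32, 16, 16).to(memory_format=torch.channels_last)
y = gn(x)

# channels_last_3d and bfloat16 are the other paths this change covers
x3d = torch.randn(2, 32, 4, 16, 16).to(memory_format=torch.channels_last_3d)
y3d = gn(x3d)
yb = gn.to(torch.bfloat16)(x.to(torch.bfloat16))
```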
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69067
Approved by: https://github.com/VitalyFedyunin
2022-05-14 03:13:02 +00:00
Kulin Seth
e011a8e18b
Enable PyTorch operations on MPS Backend. ( #77343 )
...
Add PyTorch operations to MPS backend.
- https://github.com/pytorch/pytorch/issues/77394
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77343
Approved by: https://github.com/albanD
2022-05-13 18:28:53 +00:00
mingfeima
2b7943c47c
fix torchvision failing case test_classification_model on slow_conv2d
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77347
Approved by: https://github.com/datumbox , https://github.com/frank-wei
2022-05-13 08:04:08 +00:00
PyTorch MergeBot
d92b0a51aa
Revert "Load state dict post hook"
...
This reverts commit 56bed0dcfe .
Reverted https://github.com/pytorch/pytorch/pull/76823 on behalf of https://github.com/rohan-varma
2022-05-12 21:00:49 +00:00
ecao
37c6017831
Add BFloat16 support for GLU and randperm operators on CPU ( #61944 )
...
add BFloat16 support for GLU and randperm operators on CPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61944
Approved by: https://github.com/frank-wei
2022-05-12 17:41:57 +00:00
yanbing-j
4f82f439d1
Enable BFloat16 ELU, SELU and CELU in CPU path ( #62546 )
...
Enable BFloat16 ELU, SELU and CELU in the CPU path. SELU and CELU will call the ELU implementation.
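Illustrative usage of the newly supported dtype (not taken from the PR's tests):
```python
import torch
import torch.nn.functional as F

x = torch.randn(1024).bfloat16()   # bfloat16 tensor on CPU

y_elu = F.elu(x)
y_selu = F.selu(x)                 # per the note above, routed through the ELU kernel
y_celu = F.celu(x, alpha=1.5)
```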
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62546
Approved by: https://github.com/frank-wei
2022-05-12 16:56:57 +00:00
mingfeima
3b56efd4e1
add mixed data type mode for LayerNorm forward path
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73844
Approved by: https://github.com/frank-wei
2022-05-12 03:35:06 +00:00
otaj
1aa3cbb83b
Use weakref.proxy when saving module to internal dictionaries to not increase refcount ( #76435 )
...
Fixes #76434
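The general pattern is just Python's `weakref.proxy`; a toy illustration, unrelated to the exact `nn.Module` bookkeeping touched by the PR:
```python
import weakref

class Registry:
    def __init__(self):
        self._modules = {}

    def register(self, name, module):
        # store a proxy rather than the module itself, so the registry
        # does not add a strong reference that keeps the module alive
        self._modules[name] = weakref.proxy(module)
```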
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76435
Approved by: https://github.com/jbschlosser
2022-05-11 18:40:59 +00:00
mingfeima
3d0e6f169c
add channels last support for slow_conv_dilated2d
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70665
Approved by: https://github.com/VitalyFedyunin
2022-05-11 15:28:50 +00:00
Rui Zhu
533b44a280
Add _native nested_tensor_from_mask ( #76942 )
...
Summary: Lets users convert to nested tensors more easily. Some implementation details might change based on user needs.
Test Plan: buck test mode/dev caffe2/test:nn -- test_nested_tensor_from_mask
Differential Revision: D36191182
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76942
Approved by: https://github.com/jbschlosser
2022-05-11 05:19:36 +00:00
mingfeima
3d561ee926
add channels last support for thnn_conv2d (non-dilated)
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68101
Approved by: https://github.com/VitalyFedyunin
2022-05-11 00:09:45 +00:00
neverix
87e543da9b
Add load_state_dict error message for non-dicts ( #77197 )
...
Fixes #76886
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77197
Approved by: https://github.com/jbschlosser
2022-05-10 22:11:51 +00:00
Aidyn-A
a127c584a0
Fix max pool forward nhwc ( #76597 )
...
Fixes issue #76432 .
Added dilation to loops in CUDA kernel.
cc @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76597
Approved by: https://github.com/ngimel
2022-05-10 17:39:48 +00:00
mingfeima
8d4e069e66
add BFloat16 support for UpSample on CPU
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76935
Approved by: https://github.com/frank-wei
2022-05-10 16:56:41 +00:00
Scott Wolchok
e5915a2216
[PyTorch] Don't enter MHA fast path when bias & query dtypes don't match
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76879
The fast path does not support this: transform_bias_rescale_qkv will try to grab bias.data_ptr() assuming the dtypes are the same. (Also, I have no idea how this happens.)
Differential Revision: [D36156872](https://our.internmc.facebook.com/intern/diff/D36156872/ )
Approved by: https://github.com/cpuhrsch
2022-05-09 18:21:04 +00:00
Rohan Varma
56bed0dcfe
Load state dict post hook
...
Implements `register_load_state_dict_post_hook` API as discussed in https://github.com/pytorch/pytorch/issues/75287 .
Unittests cover:
- Ensuring hooks are called with the correct module
- Hook is called with `IncompatibleKeys` field
- If hook modifies this, load_state_dict returns the modified result
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76823
Approved by: https://github.com/albanD
2022-05-05 19:27:05 +00:00
lkct
b8776e143f
Fix false DeprecationWarning in Module.state_dict
...
Fixes #75404
TODO:
- [x] add tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75507
Approved by: https://github.com/jbschlosser
2022-05-04 20:08:23 +00:00
Nikita Shulga
b074bffa41
Revert D28836788: add BFloat16 support for UpSample on CPU
...
Test Plan: revert-hammer
Differential Revision:
D28836788 (1399d83bc0 )
Original commit changeset: 63dc45e5bb91
Original Phabricator Diff: D28836788 (1399d83bc0 )
fbshipit-source-id: 92733af87cba87aed800473ff44ca6d7af037da9
(cherry picked from commit 1c9fc492503b768a343723e4cf347b30bf5dcfc2)
2022-05-02 23:13:39 +00:00
mingfeima
1399d83bc0
add BFloat16 support for UpSample on CPU ( #58297 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58297
Test Plan: Imported from OSS
Reviewed By: mikaylagawarecki
Differential Revision: D28836788
Pulled By: VitalyFedyunin
fbshipit-source-id: 63dc45e5bb91964d5ff1110262228718289435d1
(cherry picked from commit 8a37d607d6a89ccb50364cf54a6f26ca8d05cab9)
2022-05-02 22:33:26 +00:00
Scott Wolchok
e816e17655
[PyTorch] Add native fast path for transformer encoder inference ( #76333 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76333
The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.
ghstack-source-id: 154737857
(Note: this ignores all push blocking failures!)
Test Plan: CI
Reviewed By: cpuhrsch
Differential Revision: D35239925
fbshipit-source-id: 5a7eb8ff79bc6afb4b7d45075ddb2a24a6e2df28
2022-04-26 12:58:03 -04:00
Jon Janzen
2387efd356
Revert "[PyTorch] Add native fast path for transformer encoder inference"
...
This reverts commit b369b89f23 .
This has internal changes and should not have been landed via mergebot.
Ref: https://github.com/pytorch/pytorch/pull/75809#issuecomment-1108717166
2022-04-25 11:40:02 -04:00
Scott Wolchok
b369b89f23
[PyTorch] Add native fast path for transformer encoder inference
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75809
The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.
Differential Revision: [D35239925](https://our.internmc.facebook.com/intern/diff/D35239925/ )
**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35239925/ )!
Approved by: https://github.com/ezyang
2022-04-25 06:11:36 +00:00
Peter Bell
cb37e7a080
Remove F.pad python implementation
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73433
Approved by: https://github.com/albanD , https://github.com/jbschlosser
2022-04-23 00:13:20 +00:00
Joel Benjamin Schlosser
041e6e750a
Fix to support no-batch-dim inputs in ConvTransposeNd._output_padding
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76151
Approved by: https://github.com/albanD
2022-04-22 19:25:09 +00:00
Nikita Vedeneev
9e137ee583
more numerically stable cosine_similarity
...
**Previous behavior**: compute the inner product, then normalize.
**This patch**: normalize first, then compute the inner product. This should be more numerically stable because it avoids losing precision in the inner product for inputs with large norms.
By design this ensures that the cosine similarity is within `[-1.0, +1.0]`, so it should fix [#29442](https://github.com/pytorch/pytorch/issues/29442 ).
P.S. I had to change tests because this implementation handles division by 0 differently.
This PR computes cosine similarity as follows: <x/max(eps, ||x||), y/max(eps, ||y||)>.
Let f(x,y) = <x,y>/(||x|| * ||y||), then
df/dx = y/(||x|| * ||y||) - (||y||/||x|| * <x,y> * x)/(||x|| * ||y||)^2.
The changed test checks division by zero in backward when x=0 and y != 0.
For this case the non-zero part of the gradient is just y / (||x|| * ||y||).
The previous test evaluates y/(||x|| * ||y||) to y / eps, and this PR to 1/eps * y/||y||.
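A minimal Python sketch of the normalize-then-dot formulation described above (the actual change lives in the ATen implementation):
```python
import torch

def cosine_similarity_sketch(x, y, dim=-1, eps=1e-8):
    # normalize each input first (clamping the norm by eps), then take the
    # inner product; the result is bounded to [-1.0, +1.0] by construction
    x_n = x / x.norm(dim=dim, keepdim=True).clamp_min(eps)
    y_n = y / y.norm(dim=dim, keepdim=True).clamp_min(eps)
    return (x_n * y_n).sum(dim=dim)
```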
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31378
Approved by: https://github.com/ezyang , https://github.com/albanD
2022-04-22 09:28:50 +00:00