Commit Graph

1243 Commits

Author SHA1 Message Date
Kurt Mohler
4cfd09d7bc Reland: Add index value checking to MaxUnpool2d and MaxUnpool3d (#78280)
Relanding #70545
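A hedged sketch of the rechecked behavior (illustrative code, not from the PR; the error type is assumed to be a RuntimeError from a TORCH_CHECK):

```python
import torch
import torch.nn as nn

# Out-of-range indices are now rejected instead of causing undefined behavior.
pool = nn.MaxPool2d(2, return_indices=True)
unpool = nn.MaxUnpool2d(2)
pooled, indices = pool(torch.randn(1, 1, 4, 4))
unpool(pooled, indices)                 # valid indices: works as before
bad = torch.full_like(indices, 10_000)  # far out of range for a 4x4 output
try:
    unpool(pooled, bad)
except RuntimeError as e:               # error type assumed
    print(e)
```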
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78280
Approved by: https://github.com/jbschlosser
2022-06-03 20:09:07 +00:00
samdow
b7cb4eae6b Fix embedding jvp support by making embedding_renorm ignore forward mode AD (#78560)
On functorch, we started seeing [embedding forward mode fail](https://github.com/pytorch/functorch/pull/816). Looking into it, we found that [forward-mode support for embedding was recently enabled](369d9f4137), and since [max_norm doesn't work with gradcheck](https://github.com/pytorch/pytorch/blob/master/torch/testing/_internal/common_methods_invocations.py#L8877-L8881), the forward-mode path with max_norm isn't checked.

What was happening is that `embedding_renorm` was setting `torch.no_grad()`, which only turns off backward-mode AD, so functorch's jvp tests were still using forward-mode AD during the `embedding_renorm` call. This change makes sure that forward mode is also disabled during the `embedding_renorm` call.
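A minimal sketch of the underlying behavior (illustrative code, not the PR's):

```python
import torch
import torch.autograd.forward_ad as fwAD

# torch.no_grad() disables only backward-mode AD, so forward-mode
# tangents keep propagating through ops run inside it.
with fwAD.dual_level():
    x = fwAD.make_dual(torch.randn(3), torch.ones(3))
    with torch.no_grad():
        y = x * 2  # no backward graph is recorded here...
    # ...but the tangent still propagated: tensor([2., 2., 2.])
    print(fwAD.unpack_dual(y).tangent)
```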
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78560
Approved by: https://github.com/soulitzer, https://github.com/albanD
2022-06-03 19:14:51 +00:00
Eddie Yan
14b0e9e75f [cuDNN] Don't enforce bitwise exact results in test_conv_transposed_large_cuda (#78147)
`test_conv_transposed_large` expects bitwise-identical fp16 results on CUDA, but cuDNN doesn't guarantee this (e.g., when FFT-based algorithms are selected).

This PR just changes the tolerance on the test to account for these cases.
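A sketch of the approach (illustrative tolerances, not the PR's test code):

```python
import torch

# Rather than requiring bitwise-equal fp16 outputs, compare against a
# reference with explicit tolerances; the atol/rtol values are illustrative.
expected = torch.randn(8, 16, 32, 32).half()
noise = (1e-4 * torch.randn(8, 16, 32, 32)).half()
actual = expected + noise  # stand-in for a cuDNN result
torch.testing.assert_close(actual, expected, atol=5e-3, rtol=5e-3)
```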

CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78147
Approved by: https://github.com/ngimel
2022-06-03 19:03:24 +00:00
Eddie Yan
b740a99b9e [cuDNN][TF32] Threshold adjustments for TF32 on >=sm80 (#78437)
CC @ptrblck @mcarilli

The change to the transformer multilayer test could potentially be swapped for an rtol change instead (see also: #75612).
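For context, a sketch of the switches involved (not part of the PR): TF32 is what loosens fp32 accuracy on sm80+ and motivates the threshold adjustments.

```python
import torch

# Tests that need full fp32 precision can opt out of TF32 explicitly.
torch.backends.cudnn.allow_tf32 = False        # convolutions
torch.backends.cuda.matmul.allow_tf32 = False  # matmuls
```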
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78437
Approved by: https://github.com/ngimel
2022-06-03 01:02:56 +00:00
PyTorch MergeBot
d578197747 Revert "Fix embedding jvp support by making embedding_renorm ignore forward mode AD (#78560)"
This reverts commit ce7c7bb2a9.

Reverted https://github.com/pytorch/pytorch/pull/78560 on behalf of https://github.com/malfet because it broke XLA (on CI and trunk), see ce7c7bb2a9
2022-06-02 17:40:34 +00:00
samdow
ce7c7bb2a9 Fix embedding jvp support by making embedding_renorm ignore forward mode AD (#78560)
On functorch, we started seeing [embedding forward mode fail](https://github.com/pytorch/functorch/pull/816). Looking into it, we found that [forward-mode support for embedding was recently enabled](369d9f4137), and since [max_norm doesn't work with gradcheck](https://github.com/pytorch/pytorch/blob/master/torch/testing/_internal/common_methods_invocations.py#L8877-L8881), the forward-mode path with max_norm isn't checked.

What was happening is that `embedding_renorm` was setting `torch.no_grad()`, which only turns off backward-mode AD, so functorch's jvp tests were still using forward-mode AD during the `embedding_renorm` call. This change makes sure that forward mode is also disabled during the `embedding_renorm` call.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78560
Approved by: https://github.com/soulitzer, https://github.com/albanD
2022-06-02 13:40:21 +00:00
Edward Z. Yang
c20969c40c Fix ParameterList printing meta tensor
Fixes https://github.com/pytorch/pytorch/issues/78250

There are actually two bugs.  First, the crash is caused
by TensorOptions::backend incorrectly reporting noexcept when
it can fail.  Second, ParameterList is using torch.tensortype
for no good reason; we can just print the dtype instead.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78529

Approved by: https://github.com/albanD
2022-06-01 00:46:52 +00:00
mikeiovine
d6db5ea50d Back out "add mixed data type mode for LayerNorm forward path"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78298

Also back out "improve LayerNorm bfloat16 performance on CPU".

These layer norm changes seem fine, but they were preventing `LayerNorm` from using AVX2 instructions, which degraded performance on internal models. More investigation is needed to find the true root cause, but we should unland to mitigate the issue ASAP.

I left `mixed_data_type.h` around since there are some other files depending on it.

Differential Revision: [D36675352](https://our.internmc.facebook.com/intern/diff/D36675352/)

Approved by: https://github.com/tenpercent
2022-05-26 02:54:13 +00:00
PyTorch MergeBot
c50089712c Revert "Add index value checking to MaxUnpool2d and MaxUnpool3d (#70545)"
This reverts commit 53ef66bb59.

Reverted https://github.com/pytorch/pytorch/pull/70545 on behalf of https://github.com/malfet as it broke the cuda-10.2 test on trunk, see 53ef66bb59
2022-05-23 23:58:43 +00:00
Kurt Mohler
53ef66bb59 Add index value checking to MaxUnpool2d and MaxUnpool3d (#70545)
Fixes #68727

cc @mruberry @jbschlosser @walterddr @kshitij12345 @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70545
Approved by: https://github.com/ngimel
2022-05-23 21:08:25 +00:00
yuguo68
c186250d95 raise error when groups is not positive in Conv modules
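A hedged sketch of the new validation (the exact exception type is assumed to be ValueError):

```python
import torch.nn as nn

nn.Conv2d(4, 4, kernel_size=3, groups=2)      # fine: positive groups
try:
    nn.Conv2d(4, 4, kernel_size=3, groups=0)  # now rejected up front
except ValueError as e:                       # exception type assumed
    print(e)
```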
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77919

Approved by: https://github.com/jbschlosser
2022-05-23 20:35:00 +00:00
Jeff Daily
9aed30d3ad [ROCm] support benchmark flag for MIOpen (#77438)
Fixes #68172.  Generally, this fixes the flaky behavior of multiple convolution unit tests seen on ROCm.

The MIOpen integration has been forcing benchmark=True even when `torch._C._set_cudnn_benchmark(False)` is called (typically via `torch.backends.cudnn.set_flags(enabled=True, benchmark=False)`).  We now add support for MIOpen immediate mode to avoid benchmarking during MIOpen solution selection.
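A sketch of the user-facing flag on ROCm builds (illustrative):

```python
import torch

# With this change, benchmark=False is honored via MIOpen immediate mode
# instead of being silently forced back to benchmarking.
torch.backends.cudnn.benchmark = False  # MIOpen immediate mode: no autotuning
torch.backends.cudnn.benchmark = True   # opt back into algorithm benchmarking
```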
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77438
Approved by: https://github.com/ngimel, https://github.com/malfet
2022-05-23 17:10:24 +00:00
zrphercule
734a97a7c8 Revert "Revert "Switch to use nested tensor by-default in TransformerEncoder (#77217)"" (#77924)

This reverts commit 0d6fa91d1b.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77924
Approved by: https://github.com/atalman
2022-05-20 11:44:03 +00:00
George Qi
f9db8b72ac MHA forward pass bug fix
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77761

Approved by: https://github.com/jbschlosser
2022-05-19 01:21:24 +00:00
Joel Benjamin Schlosser
8881d7ac6c Support no-batch-dim for CrossEntropyLoss with prob target
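A hedged sketch of the supported call (shapes illustrative):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(5)                        # (C,), no batch dimension
target = torch.softmax(torch.randn(5), dim=0)  # (C,) class probabilities
loss = F.cross_entropy(logits, target)
```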
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77653

Approved by: https://github.com/albanD
2022-05-18 19:51:09 +00:00
Nikita Vedeneev
a760dc2687 binary_cross_entropy: double backward wrt target (#77416)
As per title. An effort to make `binary_cross_entropy` all around differentiable.
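A hedged sketch of what this enables (illustrative values; double precision as gradcheck recommends):

```python
import torch
from torch.autograd import gradgradcheck

# Second-order gradcheck with a target that itself requires grad.
inp = torch.rand(4, dtype=torch.double).clamp(1e-2, 1 - 1e-2).requires_grad_()
tgt = torch.rand(4, dtype=torch.double).requires_grad_()
gradgradcheck(torch.nn.functional.binary_cross_entropy, (inp, tgt))
```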

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77416
Approved by: https://github.com/soulitzer
2022-05-18 10:29:27 +00:00
Rui Zhu
4e2f5507d0 Add support for TxT mask layout for masked_softmax in BetterTransformer (#77607)
Summary: Expand the mask to BxHxDxD when the mask has DxD layout
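A sketch of the expansion described in the summary (names illustrative):

```python
import torch

B, H, T = 2, 4, 8
mask = torch.rand(T, T) > 0.5       # a single TxT mask shared by all heads
expanded = mask.expand(B, H, T, T)  # broadcast view to BxHxTxT, no copy
```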

Test Plan: buck build mode/opt -c fbcode.platform=platform009 -c fbcode.enable_gpu_sections=true caffe2/test:nn && buck-out/opt/gen/caffe2/test/nn\#binary.par -r masked_softmax_DxD

Differential Revision: D36428170

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77607
Approved by: https://github.com/cpuhrsch
2022-05-18 01:31:05 +00:00
PyTorch MergeBot
d8b80edade Revert "Use weakref.proxy when saving module to internal dictionaries to not increase refcount (#76435)"
This reverts commit 1aa3cbb83b.

Reverted https://github.com/pytorch/pytorch/pull/76435 on behalf of https://github.com/jbschlosser
2022-05-17 17:51:26 +00:00
mingfeima
c003494754 add channels last support for PixelShuffle and PixelUnshuffle
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50573

Approved by: https://github.com/VitalyFedyunin
2022-05-17 17:33:49 +00:00
Edward Z. Yang
b5bc954a71 Fix optional dtype/layout/memory_format pycall; fix memory format
Double-header bug fix:

- As reported by jansel, dtypes are still showing up as integers
  when the schema is an optional dtype.  This is simple enough to
  fix and I added a test for it.  But while I was at it...

- I noticed that the THPMemoryFormat_new idiom with the "unused" name
  doesn't actually work: the repr of the returned memory format
  object is wrong, and this shows up when we try to log the args/kwargs.
  So I fixed memory format to do it properly along with everything
  else.

Fixes https://github.com/pytorch/pytorch/issues/77135

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77543

Approved by: https://github.com/albanD, https://github.com/jansel
2022-05-16 16:46:08 +00:00
mingfeima
8c50414233 add BFloat16 support for BatchNorm on CPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77496

Approved by: https://github.com/frank-wei
2022-05-16 16:31:18 +00:00
mingfeima
6fa20bdfe8 add native kernel for weight_norm on CPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73845

Approved by: https://github.com/frank-wei
2022-05-16 06:36:24 +00:00
PyTorch MergeBot
93a969221d Revert "add BFloat16 support for BatchNorm on CPU"
This reverts commit 7c8911ca7a.

Reverted https://github.com/pytorch/pytorch/pull/74410 on behalf of https://github.com/albanD
2022-05-14 14:28:58 +00:00
mingfeima
7c8911ca7a add BFloat16 support for BatchNorm on CPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74410

Approved by: https://github.com/frank-wei
2022-05-14 07:49:00 +00:00
Rohan Varma
a275491c6f [Reland] load_state_dict post hook (#77392)
Reland of https://github.com/pytorch/pytorch/pull/76823 with fixes to call `__setstate__` for softmax/softmin/logsoftmax as per discussion with @albanD and @jbschlosser. Original description:

Implements `register_load_state_dict_post_hook` API as discussed in https://github.com/pytorch/pytorch/issues/75287.

Unit tests cover:
- Ensuring hooks are called with the correct module
- The hook is called with the `IncompatibleKeys` field
- If the hook modifies this, `load_state_dict` returns the modified result (see the sketch below)
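
A minimal sketch of the API (hook signature `(module, incompatible_keys)`, per the linked issue):

```python
import torch.nn as nn

def post_hook(module, incompatible_keys):
    # incompatible_keys carries .missing_keys and .unexpected_keys
    print("missing:", incompatible_keys.missing_keys)
    print("unexpected:", incompatible_keys.unexpected_keys)

m = nn.Linear(2, 2)
m.register_load_state_dict_post_hook(post_hook)
m.load_state_dict(m.state_dict())  # hook fires after loading completes
```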

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77392
Approved by: https://github.com/jbschlosser
2022-05-14 06:06:23 +00:00
mingfeima
59b56ba785 improve group_norm channels last performance on CPU
add channels_last_3d memory format support

add BFloat16 support on CPU

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69067

Approved by: https://github.com/VitalyFedyunin
2022-05-14 03:13:02 +00:00
Kulin Seth
e011a8e18b Enable PyTorch operations on MPS Backend. (#77343)
Add PyTorch operations to MPS backend.

- https://github.com/pytorch/pytorch/issues/77394
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77343
Approved by: https://github.com/albanD
2022-05-13 18:28:53 +00:00
mingfeima
2b7943c47c fix torchvision failing case test_classification_model on slow_conv2d
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77347

Approved by: https://github.com/datumbox, https://github.com/frank-wei
2022-05-13 08:04:08 +00:00
PyTorch MergeBot
d92b0a51aa Revert "Load state dict post hook"
This reverts commit 56bed0dcfe.

Reverted https://github.com/pytorch/pytorch/pull/76823 on behalf of https://github.com/rohan-varma
2022-05-12 21:00:49 +00:00
ecao
37c6017831 Add BFloat16 support for GLU and randperm operators on CPU (#61944)
add BFloat16 support for GLU and randperm operators on CPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61944
Approved by: https://github.com/frank-wei
2022-05-12 17:41:57 +00:00
yanbing-j
4f82f439d1 Enable BFloat16 ELU, SELU and CELU in CPU path (#62546)
Enable BFloat16 ELU, SELU and CELU in the CPU path. SELU and CELU will call the ELU implementation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62546
Approved by: https://github.com/frank-wei
2022-05-12 16:56:57 +00:00
mingfeima
3b56efd4e1 add mixed data type mode for LayerNorm forward path
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73844

Approved by: https://github.com/frank-wei
2022-05-12 03:35:06 +00:00
otaj
1aa3cbb83b Use weakref.proxy when saving module to internal dictionaries to not increase refcount (#76435)
Fixes #76434
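A generic sketch of the technique named in the title (plain Python, not PyTorch internals; `Child` is a hypothetical stand-in):

```python
import weakref

class Child:  # hypothetical stand-in for a module
    pass

c = Child()
registry = {"child": weakref.proxy(c)}  # usable mapping, no strong reference
del c  # the proxy does not keep the object alive; access now raises ReferenceError
```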

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76435
Approved by: https://github.com/jbschlosser
2022-05-11 18:40:59 +00:00
mingfeima
3d0e6f169c add channels last support for slow_conv_dilated2d
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70665

Approved by: https://github.com/VitalyFedyunin
2022-05-11 15:28:50 +00:00
Rui Zhu
533b44a280 Add _native nested_tensor_from_mask (#76942)
Summary: Lets users convert nested tensors more easily. Some implementation details might change based on user needs.

Test Plan: buck test mode/dev caffe2/test:nn -- test_nested_tensor_from_mask

Differential Revision: D36191182

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76942
Approved by: https://github.com/jbschlosser
2022-05-11 05:19:36 +00:00
mingfeima
3d561ee926 add channels last support for thnn_conv2d (non-dilated)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68101

Approved by: https://github.com/VitalyFedyunin
2022-05-11 00:09:45 +00:00
neverix
87e543da9b Add load_state_dict error message for non-dicts (#77197)
Fixes #76886
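A hedged sketch of the new behavior (the exception type is assumed to be TypeError):

```python
import torch.nn as nn

try:
    nn.Linear(2, 2).load_state_dict(["not", "a", "state dict"])
except TypeError as e:  # exception type assumed
    print(e)  # clear error instead of an obscure attribute failure
```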
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77197
Approved by: https://github.com/jbschlosser
2022-05-10 22:11:51 +00:00
Aidyn-A
a127c584a0 Fix max pool forward nhwc (#76597)
Fixes issue #76432.

Added dilation to loops in CUDA kernel.

cc @ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76597
Approved by: https://github.com/ngimel
2022-05-10 17:39:48 +00:00
mingfeima
8d4e069e66 add BFloat16 support for UpSample on CPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76935

Approved by: https://github.com/frank-wei
2022-05-10 16:56:41 +00:00
Scott Wolchok
e5915a2216 [PyTorch] Don't enter MHA fast path when bias & query dtypes don't match
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76879

The fast path does not support this: transform_bias_rescale_qkv will try to grab bias.data_ptr() assuming the dtypes are the same. (Also, I have no idea how this happens.)

Differential Revision: [D36156872](https://our.internmc.facebook.com/intern/diff/D36156872/)

Approved by: https://github.com/cpuhrsch
2022-05-09 18:21:04 +00:00
Rohan Varma
56bed0dcfe Load state dict post hook
Implements `register_load_state_dict_post_hook` API as discussed in https://github.com/pytorch/pytorch/issues/75287.

Unit tests cover:
- Ensuring hooks are called with the correct module
- The hook is called with the `IncompatibleKeys` field
- If the hook modifies this, `load_state_dict` returns the modified result

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76823
Approved by: https://github.com/albanD
2022-05-05 19:27:05 +00:00
lkct
b8776e143f Fix false DeprecationWarning in Module.state_dict
Fixes #75404

TODO:
- [x] add tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75507
Approved by: https://github.com/jbschlosser
2022-05-04 20:08:23 +00:00
Nikita Shulga
b074bffa41 Revert D28836788: add BFloat16 support for UpSample on CPU
Test Plan: revert-hammer

Differential Revision:
D28836788 (1399d83bc0)

Original commit changeset: 63dc45e5bb91

Original Phabricator Diff: D28836788 (1399d83bc0)

fbshipit-source-id: 92733af87cba87aed800473ff44ca6d7af037da9
(cherry picked from commit 1c9fc492503b768a343723e4cf347b30bf5dcfc2)
2022-05-02 23:13:39 +00:00
mingfeima
1399d83bc0 add BFloat16 support for UpSample on CPU (#58297)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58297

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D28836788

Pulled By: VitalyFedyunin

fbshipit-source-id: 63dc45e5bb91964d5ff1110262228718289435d1
(cherry picked from commit 8a37d607d6a89ccb50364cf54a6f26ca8d05cab9)
2022-05-02 22:33:26 +00:00
Scott Wolchok
e816e17655 [PyTorch] Add native fast path for transformer encoder inference (#76333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76333

The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.
ghstack-source-id: 154737857

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: cpuhrsch

Differential Revision: D35239925

fbshipit-source-id: 5a7eb8ff79bc6afb4b7d45075ddb2a24a6e2df28
2022-04-26 12:58:03 -04:00
Jon Janzen
2387efd356 Revert "[PyTorch] Add native fast path for transformer encoder inference"
This reverts commit b369b89f23.

This has internal changes and should not have been landed via mergebot.

Ref: https://github.com/pytorch/pytorch/pull/75809#issuecomment-1108717166
2022-04-25 11:40:02 -04:00
Scott Wolchok
b369b89f23 [PyTorch] Add native fast path for transformer encoder inference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75809

The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.

Differential Revision: [D35239925](https://our.internmc.facebook.com/intern/diff/D35239925/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35239925/)!

Approved by: https://github.com/ezyang
2022-04-25 06:11:36 +00:00
Peter Bell
cb37e7a080 Remove F.pad python implementation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73433

Approved by: https://github.com/albanD, https://github.com/jbschlosser
2022-04-23 00:13:20 +00:00
Joel Benjamin Schlosser
041e6e750a Fix to support no-batch-dim inputs in ConvTransposeNd._output_padding
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76151

Approved by: https://github.com/albanD
2022-04-22 19:25:09 +00:00
Nikita Vedeneev
9e137ee583 more numerically stable cosine_similarity
**Previous behavior**: compute the inner product, then normalize.
**This patch**: normalize first, then compute the inner product. This should be more numerically stable because it avoids losing precision in the inner product for inputs with large norms.
By design, this ensures that the cosine similarity is within `[-1.0, +1.0]`, so it should fix [#29442](https://github.com/pytorch/pytorch/issues/29442).

P.S. I had to change the tests because this implementation handles division by 0 differently.
This PR computes cosine similarity as <x/max(eps, ||x||), y/max(eps, ||y||)>.
Let f(x,y) = <x,y>/(||x|| * ||y||); then
df/dx = y/(||x|| * ||y||) - (||y||/||x|| * <x,y> * x)/(||x|| * ||y||)^2.
The changed test checks division by zero in backward when x = 0 and y != 0.
In this case the non-zero part of the gradient is just y/(||x|| * ||y||).
The previous implementation evaluates y/(||x|| * ||y||) as y/eps, while this PR evaluates it as (1/eps) * y/||y||.
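A sketch of the computation described above (illustrative Python, not the actual ATen kernel):

```python
import torch

def cosine_similarity_sketch(x, y, dim=-1, eps=1e-8):
    # Clamp each norm by eps, normalize, then take the inner product;
    # the result is bounded to [-1, 1] by construction.
    x = x / x.norm(dim=dim, keepdim=True).clamp_min(eps)
    y = y / y.norm(dim=dim, keepdim=True).clamp_min(eps)
    return (x * y).sum(dim=dim)

print(cosine_similarity_sketch(torch.randn(4, 8), torch.randn(4, 8)))
```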
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31378
Approved by: https://github.com/ezyang, https://github.com/albanD
2022-04-22 09:28:50 +00:00