mingfeima
84b1c9798c
add BFloat16 support for AvgPool2d on CPU ( #66927 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66927
Test Plan: Imported from OSS
Reviewed By: mikaylagawarecki
Differential Revision: D33353198
Pulled By: VitalyFedyunin
fbshipit-source-id: 1aeaa4bb90ac99210b8f6051c09d6995d06ce3a1
2022-01-14 07:59:10 -08:00
mingfeima
910c01020e
add BFloat16 support for AdaptiveMaxPool2d on CPU ( #66929 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66929
Test Plan: Imported from OSS
Reviewed By: mikaylagawarecki
Differential Revision: D33353199
Pulled By: VitalyFedyunin
fbshipit-source-id: d402d5deb7ca766259ca42118ddc16625e134c4c
2022-01-13 20:00:42 -08:00
Jake Tae
eac3decf93
ModuleList concatenation ( #70887 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70441 .
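A minimal usage sketch of the behavior this adds, assuming concatenation is exposed via the `+` operator as the title suggests:
```python
import torch.nn as nn

# Concatenating two ModuleLists (assumed `+` operator per the PR title);
# the result contains the modules of both operands, in order.
a = nn.ModuleList([nn.Linear(4, 4), nn.ReLU()])
b = nn.ModuleList([nn.Linear(4, 2)])

combined = a + b
print(len(combined))   # 3
print(type(combined))  # ModuleList
```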
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70887
Reviewed By: ejguan
Differential Revision: D33555431
Pulled By: albanD
fbshipit-source-id: ce42459ee46a611e98e89f02686acbac16b6b668
2022-01-13 15:31:07 -08:00
mingfeima
385773cb77
add BFloat16 support for MaxPool2d on CPU ( #56903 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56903
Test Plan: Imported from OSS
Reviewed By: mikaylagawarecki
Differential Revision: D28836791
Pulled By: VitalyFedyunin
fbshipit-source-id: e03d55cc30dfa3628f096938fbad34b1031948af
2022-01-12 14:20:20 -08:00
Ilya Persky
a8612cd72a
Skip failing tests in test_nn if compiled without LAPACK ( #70913 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70912
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70913
Reviewed By: mruberry
Differential Revision: D33534840
Pulled By: albanD
fbshipit-source-id: 0facf5682140ecd7a78edb34b9cd997f9319e084
2022-01-11 12:21:18 -08:00
George Qi
d7db5fb462
ctc loss no batch dim support ( #70092 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70092
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D33280068
Pulled By: george-qi
fbshipit-source-id: 3278fb2d745a396fe27c00fb5f40df0e7f584f81
2022-01-07 14:33:22 -08:00
Bin Bao
f135438d3b
Dispatch to at::convolution instead of at::_convolution in _convolution_double_backward ( #70661 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70661
Dispatching to at::convolution can make Lazy Tensor trace the right convolution op.
Test Plan: pytest test/test_nn.py -k test_conv_double_backward_strided_with_3D_input_and_weight
Reviewed By: wconstab, jbschlosser
Differential Revision: D33428780
Pulled By: desertfire
fbshipit-source-id: 899e4135588ea33fff23d16103c25d9bcd3f902c
2022-01-07 07:53:46 -08:00
Joel Schlosser
e6befbe85c
Add flag to optionally average output attention weights across heads ( #70055 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47583
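A hedged sketch of the new option; the keyword name `average_attn_weights` and its placement on `forward()` are assumptions based on the linked issue:
```python
import torch
import torch.nn as nn

# With averaging, attention weights are (N, L, S); without, they keep the head dim.
mha = nn.MultiheadAttention(embed_dim=8, num_heads=2, batch_first=True)
x = torch.randn(1, 5, 8)

_, avg_w = mha(x, x, x, need_weights=True, average_attn_weights=True)
_, per_head_w = mha(x, x, x, need_weights=True, average_attn_weights=False)

print(avg_w.shape)       # torch.Size([1, 5, 5])
print(per_head_w.shape)  # torch.Size([1, 2, 5, 5])
```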
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70055
Reviewed By: bhosmer
Differential Revision: D33457866
Pulled By: jbschlosser
fbshipit-source-id: 17746b3668b0148c1e1ed8333227b7c42f1e3bf5
2022-01-06 17:32:37 -08:00
Joel Schlosser
7b8f73dd32
No-batch-dim support for ConvNd ( #70506 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70506
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33355034
Pulled By: jbschlosser
fbshipit-source-id: 5a42645299b1d82cee7d461826acca1c5b35a71c
2022-01-06 16:53:50 -08:00
Jake Tae
b7742b437a
Allow RNN hidden_size to be 0 ( #70556 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/56767 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70556
Reviewed By: ngimel
Differential Revision: D33455156
Pulled By: jbschlosser
fbshipit-source-id: 5dc57b09d7beb6ae81dfabc318e87c109bb4e6ae
2022-01-06 14:18:36 -08:00
Jane Xu
c00d33033c
Remove repeat test for types in test nn ( #70872 )
...
Summary:
Helps fix a part of https://github.com/pytorch/pytorch/issues/69865
The first commit just migrates everything as-is.
The second commit uses the "device" variable instead of passing "cuda" everywhere.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70872
Reviewed By: jbschlosser
Differential Revision: D33455941
Pulled By: janeyx99
fbshipit-source-id: 9d9ec8c95f1714c40d55800e652ccd69b0c314dc
2022-01-06 09:20:02 -08:00
soulitzer
3051aabd0e
Add forward AD formulas for convolution and some others ( #69956 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69956
Test Plan: Imported from OSS
Reviewed By: albanD, bdhirsh
Differential Revision: D33235974
Pulled By: soulitzer
fbshipit-source-id: ea60d687edc5d62d92f3fd3cb6640421d32c908c
2022-01-06 08:39:51 -08:00
Joel Schlosser
b60b1b100f
Set cuDNN deterministic flag for test_conv_double_backward_cuda ( #69941 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/69833
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69941
Reviewed By: george-qi
Differential Revision: D33430727
Pulled By: jbschlosser
fbshipit-source-id: 4a250bd0e5460ee631730afe0ab68ba72f37d292
2022-01-05 10:05:56 -08:00
kshitij12345
7bfaa230be
[nn] adaptive_avg_pool{1/2/3}d : Error on negative output_size ( #70488 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70232
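A small sketch of the new behavior, assuming the check surfaces as a `RuntimeError`:
```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8)

# A negative output_size is now rejected up front (assumed to raise RuntimeError)
try:
    F.adaptive_avg_pool1d(x, output_size=-1)
except RuntimeError as e:
    print("rejected:", e)
```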
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70488
Reviewed By: H-Huang
Differential Revision: D33367289
Pulled By: jbschlosser
fbshipit-source-id: 6b7b89d72c4e1e049ad6a0addb22a261c28ddb4c
2021-12-30 14:42:11 -08:00
mingfeima
401a6b682b
add BFloat16 support for AdaptiveAvgPool2d on CPU ( #56902 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56902
Test Plan: Imported from OSS
Reviewed By: mikaylagawarecki
Differential Revision: D28836789
Pulled By: VitalyFedyunin
fbshipit-source-id: caac5e5b15190b8010bbfbc6920aa44032208ee7
2021-12-30 11:58:37 -08:00
vfdev
d2abf3f981
Added antialias flag to interpolate (CPU only, bicubic) ( #68819 )
...
Summary:
Description:
- Added antialias flag to interpolate (CPU only)
- forward and backward for bicubic mode
- added tests
Previous PR for bilinear, https://github.com/pytorch/pytorch/pull/65142
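A minimal usage sketch of the flag on the public API (CPU tensor, bicubic mode, per the description above):
```python
import torch
import torch.nn.functional as F

# Anti-aliased bicubic downsampling on CPU; antialias=True is the new flag.
x = torch.rand(1, 3, 906, 438)
out = F.interpolate(x, size=(320, 196), mode="bicubic",
                    align_corners=False, antialias=True)
print(out.shape)  # torch.Size([1, 3, 320, 196])
```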
### Benchmarks
<details>
<summary>
Forward pass, CPU. PTH interpolation vs PIL
</summary>
Cases:
- PTH RGB 3 Channels, float32 vs PIL RGB uint8 (apples vs pears)
- PTH 1 Channel, float32 vs PIL 1 Channel Float
Code: https://gist.github.com/vfdev-5/b173761a567f2283b3c649c3c0574112
```
Torch config: PyTorch built with:
- GCC 9.3
- C++ Version: 201402
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- CPU capability usage: AVX2
- CUDA Runtime 11.1
- NVCC architecture flags: -gencode;arch=compute_61,code=sm_61
- CuDNN 8.0.5
- Build settings: BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=1, USE_CUDNN=1, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=0, USE_OPENMP=ON, USE_ROCM=OFF,
Num threads: 1
[------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (320, 196) -------------------]
| Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58
1 threads: -------------------------------------------------------------------------------------------------
channels_first contiguous torch.float32 | 4.5 | 5.2
channels_last non-contiguous torch.float32 | 4.5 | 5.3
Times are in milliseconds (ms).
[------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (460, 220) -------------------]
| Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58
1 threads: -------------------------------------------------------------------------------------------------
channels_first contiguous torch.float32 | 5.7 | 6.4
channels_last non-contiguous torch.float32 | 5.7 | 6.4
Times are in milliseconds (ms).
[------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 96) --------------------]
| Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58
1 threads: -------------------------------------------------------------------------------------------------
channels_first contiguous torch.float32 | 3.0 | 4.0
channels_last non-contiguous torch.float32 | 2.9 | 4.1
Times are in milliseconds (ms).
[------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (1200, 196) -------------------]
| Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58
1 threads: -------------------------------------------------------------------------------------------------
channels_first contiguous torch.float32 | 14.7 | 17.1
channels_last non-contiguous torch.float32 | 14.8 | 17.2
Times are in milliseconds (ms).
[------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 1200) -------------------]
| Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58
1 threads: -------------------------------------------------------------------------------------------------
channels_first contiguous torch.float32 | 3.5 | 3.9
channels_last non-contiguous torch.float32 | 3.5 | 3.9
Times are in milliseconds (ms).
[---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (320, 196) ---------]
| Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58
1 threads: ------------------------------------------------------------------------------
contiguous torch.float32 | 2.4 | 1.8
Times are in milliseconds (ms).
[---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (460, 220) ---------]
| Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58
1 threads: ------------------------------------------------------------------------------
contiguous torch.float32 | 3.1 | 2.2
Times are in milliseconds (ms).
[---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 96) ----------]
| Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58
1 threads: ------------------------------------------------------------------------------
contiguous torch.float32 | 1.6 | 1.4
Times are in milliseconds (ms).
[--------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (1200, 196) ---------]
| Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58
1 threads: ------------------------------------------------------------------------------
contiguous torch.float32 | 7.9 | 5.7
Times are in milliseconds (ms).
[--------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 1200) ---------]
| Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58
1 threads: ------------------------------------------------------------------------------
contiguous torch.float32 | 1.7 | 1.3
Times are in milliseconds (ms).
```
</details>
Code is moved from torchvision: https://github.com/pytorch/vision/pull/3810 and https://github.com/pytorch/vision/pull/4208
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68819
Reviewed By: mikaylagawarecki
Differential Revision: D33339117
Pulled By: jbschlosser
fbshipit-source-id: 6a0443bbba5439f52c7dbc1be819b75634cf67c4
2021-12-29 14:04:43 -08:00
George Qi
8af39b7668
AdaptiveLogSoftmaxWithLoss no_batch_dim support ( #69054 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69054
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D33200166
Pulled By: george-qi
fbshipit-source-id: 9d953744351a25f372418d2a64e8402356d1e9b7
2021-12-29 10:25:26 -08:00
soulitzer
3116d87024
Add forward AD formulas for {adaptive_,fractional_,}max_pool{2,3}d_{backward,} ( #69884 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69884
Also fixes: https://github.com/pytorch/pytorch/issues/69322 , https://github.com/pytorch/pytorch/issues/69325
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D33093039
Pulled By: soulitzer
fbshipit-source-id: b9a522a00f4e9e85974888de5058de07280f8f66
2021-12-23 15:51:09 -08:00
soulitzer
5651e1e3ad
Add auto_linear formulas and some others ( #69727 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69727
Still need to test the backward ones. We would need to update gradgradcheck to check forward over backward.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33031728
Pulled By: soulitzer
fbshipit-source-id: 86c59df5d2196b5c8dbbb1efed9321e02ab46d30
2021-12-20 12:15:25 -08:00
Albert Liang
0d06616c47
Add dict methods to ParameterDict ( #69403 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68476
We implemented all of the following `dict` methods for `ParameterDict`
- `get `
- `setdefault`
- `popitem`
- `fromkeys`
- `copy`
- `__or__`
- `__ior__`
- `__reversed__`
- `__ror__`
The behavior of these new methods matches the expected behavior of python `dict` as defined by the language itself: https://docs.python.org/3/library/stdtypes.html#typesmapping
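A short usage sketch of a few of the new methods (semantics assumed to mirror the built-in `dict`, per the list above):
```python
import torch
import torch.nn as nn

pd = nn.ParameterDict({"w": nn.Parameter(torch.randn(2, 2))})

w = pd.get("w")                                   # dict.get semantics
pd.setdefault("b", nn.Parameter(torch.zeros(2)))  # insert only if missing
pd2 = pd.copy()                                   # shallow copy

# Merge with | / |= as in Python 3.9+ dicts
merged = pd | nn.ParameterDict({"v": nn.Parameter(torch.ones(3))})
print(list(merged.keys()))  # ['w', 'b', 'v']
```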
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69403
Reviewed By: albanD
Differential Revision: D33187111
Pulled By: jbschlosser
fbshipit-source-id: ecaa493837dbc9d8566ddbb113b898997e2debcb
2021-12-17 10:15:47 -08:00
Rui Zhu
46ace4ac33
Add support for masked_softmax when softmax_elements > 1024 & corresponding unit tests ( #69924 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69924
Test Plan: buck build mode/opt -c fbcode.enable_gpu_sections=true caffe2/test:nn && buck-out/gen/caffe2/test/nn\#binary.par -r test_masked_softmax
Reviewed By: ngimel
Differential Revision: D32819181
fbshipit-source-id: 6838a11d3554ec8e1bd48f1c2c7b1ee3a4680995
2021-12-15 16:44:15 -08:00
kshitij12345
e8d5c7cf7f
[nn] mha : no-batch-dim support (python) ( #67176 )
...
Summary:
Reference: https://github.com/pytorch/pytorch/issues/60585
* [x] Update docs
* [x] Tests for shape checking
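A minimal sketch of the unbatched usage this enables (2D `(L, E)` inputs instead of the usual 3D):
```python
import torch
import torch.nn as nn

# Unbatched query/key/value: (seq_len, embed_dim) with no batch dimension.
mha = nn.MultiheadAttention(embed_dim=8, num_heads=2)
q = torch.randn(5, 8)

out, attn = mha(q, q, q)
print(out.shape)   # torch.Size([5, 8])
print(attn.shape)  # torch.Size([5, 5])
```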
Tests take roughly 20s on the system that I use. Below are the timings for the slowest 20 tests.
```
pytest test/test_modules.py -k _multih --durations=20
============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.10.0, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
rootdir: /home/kshiteej/Pytorch/pytorch_no_batch_mha, configfile: pytest.ini
plugins: hypothesis-6.23.2, repeat-0.9.1
collected 372 items / 336 deselected / 36 selected
test/test_modules.py ..............ssssssss.............. [100%]
================================================================================================ warnings summary ================================================================================================
../../.conda/envs/pytorch-cuda-dev/lib/python3.10/site-packages/torch/backends/cudnn/__init__.py:73
test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiheadAttention_cuda_float32
/home/kshiteej/.conda/envs/pytorch-cuda-dev/lib/python3.10/site-packages/torch/backends/cudnn/__init__.py:73: UserWarning: PyTorch was compiled without cuDNN/MIOpen support. To use cuDNN/MIOpen, rebuild PyTorch making sure the library is visible to the build system.
warnings.warn(
-- Docs: https://docs.pytest.org/en/stable/warnings.html
============================================================================================== slowest 20 durations ==============================================================================================
8.66s call test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MultiheadAttention_cuda_float64
2.02s call test/test_modules.py::TestModuleCPU::test_gradgrad_nn_MultiheadAttention_cpu_float64
1.89s call test/test_modules.py::TestModuleCUDA::test_grad_nn_MultiheadAttention_cuda_float64
1.01s call test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiheadAttention_cuda_float32
0.51s call test/test_modules.py::TestModuleCPU::test_grad_nn_MultiheadAttention_cpu_float64
0.46s call test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiheadAttention_cuda_float32
0.45s call test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiheadAttention_cuda_float64
0.44s call test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiheadAttention_cuda_float32
0.21s call test/test_modules.py::TestModuleCUDA::test_pickle_nn_MultiheadAttention_cuda_float64
0.21s call test/test_modules.py::TestModuleCUDA::test_pickle_nn_MultiheadAttention_cuda_float32
0.18s call test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiheadAttention_cuda_float64
0.17s call test/test_modules.py::TestModuleCPU::test_non_contiguous_tensors_nn_MultiheadAttention_cpu_float32
0.16s call test/test_modules.py::TestModuleCPU::test_non_contiguous_tensors_nn_MultiheadAttention_cpu_float64
0.11s call test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiheadAttention_cuda_float64
0.08s call test/test_modules.py::TestModuleCPU::test_pickle_nn_MultiheadAttention_cpu_float32
0.08s call test/test_modules.py::TestModuleCPU::test_pickle_nn_MultiheadAttention_cpu_float64
0.06s call test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiheadAttention_cuda_float64
0.06s call test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiheadAttention_cuda_float32
0.06s call test/test_modules.py::TestModuleCPU::test_forward_nn_MultiheadAttention_cpu_float32
0.06s call test/test_modules.py::TestModuleCPU::test_forward_nn_MultiheadAttention_cpu_float64
============================================================================================ short test summary info =============================================================================================
=========================================================================== 28 passed, 8 skipped, 336 deselected, 2 warnings in 19.71s ===========================================================================
```
cc albanD mruberry jbschlosser walterddr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67176
Reviewed By: dagitses
Differential Revision: D33094285
Pulled By: jbschlosser
fbshipit-source-id: 0dd08261b8a457bf8bad5c7f3f6ded14b0beaf0d
2021-12-14 13:21:21 -08:00
Rui Zhu
1a299d8f1b
Add support for transformer layout of masked_softmax ( #69272 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69272
In the transformer encoder and MHA, masked_softmax's mask is a 2D tensor (B, D), while the input is a 4D tensor (B, H, D, D).
The mask could simply be broadcast to (B, H, D, D) like the input and then passed through a regular masked_softmax; however, that produces a non-contiguous mask and consumes more memory.
In this diff, we keep the mask's shape unchanged and compute the corresponding mask element for the input in each CUDA thread.
This new layout is not supported on CPU yet.
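For reference, the semantics described above written with public ops (a sketch of what the fused kernel computes, not the kernel itself):
```python
import torch

B, H, D = 2, 4, 6
scores = torch.randn(B, H, D, D)             # input, as in MHA attention scores
key_padding = torch.zeros(B, D, dtype=torch.bool)
key_padding[:, -2:] = True                   # mask out the last two key positions

# Broadcast the 2D (B, D) mask against the 4D input without materializing
# a full (B, H, D, D) mask tensor.
out = scores.masked_fill(key_padding[:, None, None, :], float("-inf")).softmax(dim=-1)
print(out.shape)                             # torch.Size([2, 4, 6, 6])
```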
Test Plan: buck build mode/opt -c fbcode.enable_gpu_sections=true caffe2/test:nn && buck-out/gen/caffe2/test/nn\#binary.par -r test_masked_softmax
Reviewed By: ngimel
Differential Revision: D32605557
fbshipit-source-id: ef37f86981fdb2fb264d776f0e581841de5d68d2
2021-12-14 10:51:58 -08:00
Joel Schlosser
fc37e5b3ed
Hook up general convolution to convolution_backward ( #69584 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69584
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D32936380
Pulled By: jbschlosser
fbshipit-source-id: c6fdd88db33bd1a9d0eabea47ae09a4d5b170e92
2021-12-12 17:30:01 -08:00
Joel Schlosser
f0e98dcbd3
General convolution_backward function ( #69044 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69044
Test Plan: Imported from OSS
Reviewed By: zou3519, albanD, H-Huang
Differential Revision: D32708818
Pulled By: jbschlosser
fbshipit-source-id: e563baa3197811d8d51553fc83718ace2f8d1b7a
2021-12-12 15:53:38 -08:00
Rui Zhu
aab67c6dff
Add native masked_softmax ( #69268 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69268
This diff enables native masked softmax on CUDA and expands the current warp_softmax to accept masking.
The mask in this masked softmax has to be the same shape as the input and has to be contiguous.
In a follow-up diff, I will include the encoder mask layout, where the input is (B, H, D, D) and the mask is (B, D).
Test Plan: buck build mode/opt -c fbcode.enable_gpu_sections=true caffe2/test:nn && buck-out/gen/caffe2/test/nn\#binary.par -r test_masked_softmax
Reviewed By: ngimel
Differential Revision: D32338419
fbshipit-source-id: 48c3fde793ad4535725d9dae712db42e2bdb8a49
2021-12-09 23:29:45 -08:00
kshitij12345
7407e3d6fd
[fix] cross_entropy : fix weight with ignore_index and label_smoothing ( #69511 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/69339
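A small sketch of the configuration this fix covers (class `weight` combined with `ignore_index` and `label_smoothing`):
```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)
target = torch.tensor([0, 2, 1, -100])      # -100 entries are ignored
class_weight = torch.tensor([1.0, 2.0, 0.5])

loss = F.cross_entropy(
    logits, target,
    weight=class_weight,
    ignore_index=-100,
    label_smoothing=0.1,
)
print(loss)
```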
cc albanD mruberry jbschlosser walterddr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69511
Reviewed By: mrshenli
Differential Revision: D32951935
Pulled By: jbschlosser
fbshipit-source-id: 482eae851861a32f96bd6231dd3448fb6d44a015
2021-12-08 12:08:33 -08:00
jjsjann123
3c1e2ff9eb
fixing layer_norm cuda bug ( #69210 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/69208
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69210
Reviewed By: H-Huang
Differential Revision: D32764811
Pulled By: ngimel
fbshipit-source-id: fb4201fe5f2284fcb22e36bc1029eef4a21b09bf
2021-12-01 15:46:47 -08:00
Kurt Mohler
d507fd63f3
Check that block height and width are positive in nn.Fold ( #69048 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68875
cc albanD mruberry jbschlosser walterddr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69048
Reviewed By: samdow
Differential Revision: D32729307
Pulled By: jbschlosser
fbshipit-source-id: 162cafb005873012d900d86997d07640967038c0
2021-12-01 10:08:47 -08:00
Omkar Salpekar
8e343ba5db
Revert D32611368: [pytorch][PR] Initial version of general convolution_backward
...
Test Plan: revert-hammer
Differential Revision:
D32611368 (445b31abff )
Original commit changeset: 26d759b7c908
fbshipit-source-id: e91f45f0f31150e60d657a3964b7e42027beff58
2021-11-23 13:39:36 -08:00
Joel Schlosser
445b31abff
Initial version of general convolution_backward ( #65219 )
...
Summary:
Towards [convolution consolidation](https://fb.quip.com/tpDsAYtO15PO ).
Introduces the general `convolution_backward` function that uses the factored-out backend routing logic from the forward function.
Some notes:
* `finput` is now recomputed in the backward pass for the slow 2d / 3d kernels instead of being saved from the forward pass. The logic for this is based on the forward computation and is present in the `compute_finput2d` / `compute_finput3d` functions in `ConvUtils.h`.
* Using structured kernels for `convolution_backward` requires extra copying since the backend-specific backward functions return tensors. Porting to structured is left as future work.
* The tests that check the routing logic have been renamed from `test_conv_backend_selection` -> `test_conv_backend` and now also include gradcheck validation using an `autograd.Function` hooking up `convolution` to `convolution_backward`. This was done to ensure that gradcheck passes for the same set of inputs / backends.
The forward pass routing is done as shown in this flowchart (probably need to download it for it to be readable since it's ridiculous): [flowchart images omitted]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65219
Reviewed By: mruberry
Differential Revision: D32611368
Pulled By: jbschlosser
fbshipit-source-id: 26d759b7c908ab8f19ecce627acea7bd3d5f59ba
2021-11-23 08:19:45 -08:00
soulitzer
7bb401a4c9
Add forward AD support for miscellanous operators ( #67820 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67820
Original PR here: https://github.com/pytorch/pytorch/pull/67040
Test Plan: Imported from OSS
Reviewed By: gchanan
Differential Revision: D32314423
Pulled By: soulitzer
fbshipit-source-id: ecd898dc903692cab084f6922a1d86986f957b1b
2021-11-19 14:31:06 -08:00
jiej
ca92111758
Add native_dropout ( #63937 )
...
Summary:
Adds native_dropout to provide a reasonable target for TorchScript autodiff. native_dropout has scale and train as arguments in its signature; this makes native_dropout more consistent with other operators and removes conditionals in the autodiff definition.
cc gmagogsfm
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63937
Reviewed By: mruberry
Differential Revision: D32477657
Pulled By: ngimel
fbshipit-source-id: d37b137a37acafa50990f60c77f5cea2818454e4
2021-11-18 19:41:10 -08:00
kshitij12345
d5d2096dab
[testing] make @dtypes mandatory when using @dtypesIf ( #68186 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53647
With this, if a test forgets to add `dtypes` while using `dtypesIf`, the following error is raised:
```
AssertionError: dtypes is mandatory when using dtypesIf however 'test_exponential_no_zero' didn't specify it
```
**Tested Locally**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68186
Reviewed By: VitalyFedyunin
Differential Revision: D32468581
Pulled By: mruberry
fbshipit-source-id: 805e0855f988b77a5d8d4cd52b31426c04c2200b
2021-11-18 08:29:31 -08:00
vfdev-5
3da2e09c9b
Added antialias flag to interpolate (CPU only, bilinear) ( #65142 )
...
Summary:
Description:
- Added antialias flag to interpolate (CPU only)
- forward and backward for bilinear mode
- added tests
### Benchmarks
<details>
<summary>
Forward pass, CPU. PTH interpolation vs PIL
</summary>
Cases:
- PTH RGB 3 Channels, float32 vs PIL RGB uint8 (apples vs pears)
- PTH 1 Channel, float32 vs PIL 1 Channel Float
Code: https://gist.github.com/vfdev-5/b173761a567f2283b3c649c3c0574112
```
# OMP_NUM_THREADS=1 python bench_interp_aa_vs_pillow.py
Torch config: PyTorch built with:
- GCC 9.3
- C++ Version: 201402
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- CPU capability usage: AVX2
- CUDA Runtime 11.1
- NVCC architecture flags: -gencode;arch=compute_75,code=sm_75
- CuDNN 8.0.5
- Build settings: BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=1, USE_CUDNN=1, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=0, USE_OPENMP=ON,
Num threads: 1
[------------------------ Downsampling: torch.Size([1, 3, 906, 438]) -> (320, 196) ------------------------]
| Reference, PIL 8.3.2, mode: RGB | 1.10.0a0+git1e87d91
1 threads: -------------------------------------------------------------------------------------------------
channels_first contiguous torch.float32 | 2.9 | 3.1
channels_last non-contiguous torch.float32 | 2.6 | 3.6
Times are in milliseconds (ms).
[------------------------ Downsampling: torch.Size([1, 3, 906, 438]) -> (460, 220) ------------------------]
| Reference, PIL 8.3.2, mode: RGB | 1.10.0a0+git1e87d91
1 threads: -------------------------------------------------------------------------------------------------
channels_first contiguous torch.float32 | 3.4 | 4.0
channels_last non-contiguous torch.float32 | 3.4 | 4.8
Times are in milliseconds (ms).
[------------------------ Downsampling: torch.Size([1, 3, 906, 438]) -> (120, 96) -------------------------]
| Reference, PIL 8.3.2, mode: RGB | 1.10.0a0+git1e87d91
1 threads: -------------------------------------------------------------------------------------------------
channels_first contiguous torch.float32 | 1.6 | 1.8
channels_last non-contiguous torch.float32 | 1.6 | 1.9
Times are in milliseconds (ms).
[----------------------- Downsampling: torch.Size([1, 3, 906, 438]) -> (1200, 196) ------------------------]
| Reference, PIL 8.3.2, mode: RGB | 1.10.0a0+git1e87d91
1 threads: -------------------------------------------------------------------------------------------------
channels_first contiguous torch.float32 | 9.0 | 11.3
channels_last non-contiguous torch.float32 | 8.9 | 12.5
Times are in milliseconds (ms).
[----------------------- Downsampling: torch.Size([1, 3, 906, 438]) -> (120, 1200) ------------------------]
| Reference, PIL 8.3.2, mode: RGB | 1.10.0a0+git1e87d91
1 threads: -------------------------------------------------------------------------------------------------
channels_first contiguous torch.float32 | 2.1 | 1.8
channels_last non-contiguous torch.float32 | 2.1 | 3.4
Times are in milliseconds (ms).
[--------------- Downsampling: torch.Size([1, 1, 906, 438]) -> (320, 196) --------------]
| Reference, PIL 8.3.2, mode: F | 1.10.0a0+git1e87d91
1 threads: ------------------------------------------------------------------------------
contiguous torch.float32 | 1.2 | 1.0
Times are in milliseconds (ms).
[--------------- Downsampling: torch.Size([1, 1, 906, 438]) -> (460, 220) --------------]
| Reference, PIL 8.3.2, mode: F | 1.10.0a0+git1e87d91
1 threads: ------------------------------------------------------------------------------
contiguous torch.float32 | 1.4 | 1.3
Times are in milliseconds (ms).
[--------------- Downsampling: torch.Size([1, 1, 906, 438]) -> (120, 96) ---------------]
| Reference, PIL 8.3.2, mode: F | 1.10.0a0+git1e87d91
1 threads: ------------------------------------------------------------------------------
contiguous torch.float32 | 719.9 | 599.9
Times are in microseconds (us).
[-------------- Downsampling: torch.Size([1, 1, 906, 438]) -> (1200, 196) --------------]
| Reference, PIL 8.3.2, mode: F | 1.10.0a0+git1e87d91
1 threads: ------------------------------------------------------------------------------
contiguous torch.float32 | 3.7 | 3.5
Times are in milliseconds (ms).
[-------------- Downsampling: torch.Size([1, 1, 906, 438]) -> (120, 1200) --------------]
| Reference, PIL 8.3.2, mode: F | 1.10.0a0+git1e87d91
1 threads: ------------------------------------------------------------------------------
contiguous torch.float32 | 834.4 | 605.7
Times are in microseconds (us).
```
</details>
Code is moved from torchvision: https://github.com/pytorch/vision/pull/4208
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65142
Reviewed By: mrshenli
Differential Revision: D32432405
Pulled By: jbschlosser
fbshipit-source-id: b66c548347f257c522c36105868532e8bc1d4c6d
2021-11-17 09:10:15 -08:00
vfdev-5
6adbe044e3
Added nearest-exact interpolation mode ( #64501 )
...
Summary:
Added "nearest-exact" interpolation mode to fix the issues: https://github.com/pytorch/pytorch/issues/34808 and https://github.com/pytorch/pytorch/issues/62237 .
Description:
As we cannot fix "nearest" mode without a large impact on already-trained models, [it was suggested](https://github.com/pytorch/pytorch/pull/64501#pullrequestreview-749771815 ) to introduce a new mode instead of fixing the existing "nearest" mode.
- The new "nearest-exact" mode performs the index computation for nearest interpolation so that it matches scikit-image, Pillow, and TF2, while "nearest" mode still matches OpenCV INTER_NEAREST, which appears to be buggy; see https://ppwwyyxx.com/blog/2021/Where-are-Pixels/#Libraries .
"nearest":
```
input_index_f32 = output_index * scale
input_index = floor(input_index_f32)
```
"nearest-exact"
```
input_index_f32 = (output_index + 0.5) * scale - 0.5
input_index = round(input_index_f32)
```
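A quick comparison of the two index rules on the public API, assuming a build that includes the new mode:
```python
import torch
import torch.nn.functional as F

x = torch.arange(5, dtype=torch.float32).reshape(1, 1, 5)

legacy = F.interpolate(x, size=3, mode="nearest")        # input_index = floor(out * scale)
exact = F.interpolate(x, size=3, mode="nearest-exact")   # input_index = round((out + 0.5) * scale - 0.5)

print(legacy.flatten().tolist())  # picks source pixels 0, 1, 3
print(exact.flatten().tolist())   # picks source pixels 0, 2, 4 (center-aligned)
```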
Comparisons with other libs: https://gist.github.com/vfdev-5/a5bd5b1477b1c82a87a0f9e25c727664
PyTorch version | 1.9.0 "nearest" | this PR "nearest" | this PR "nearest-exact"
---|---|---|---
Resize option: | | |
OpenCV INTER_NEAREST result mismatches | 0 | 0 | 10
OpenCV INTER_NEAREST_EXACT result mismatches | 9 | 9 | 9
Scikit-Image result mismatches | 10 | 10 | 0
Pillow result mismatches | 10 | 10 | 7
TensorFlow result mismatches | 10 | 10 | 0
Rescale option: | | |
size mismatches (https://github.com/pytorch/pytorch/issues/62396 ) | 10 | 10 | 10
OpenCV INTER_NEAREST result mismatches | 3 | 3 | 5
OpenCV INTER_NEAREST_EXACT result mismatches | 3 | 3 | 4
Scikit-Image result mismatches | 4 | 4 | 0
Scipy result mismatches | 4 | 4 | 0
TensorFlow: no such option | - | - | -
Versions:
```
skimage: 0.19.0.dev0
opencv: 4.5.4-dev
scipy: 1.7.2
Pillow: 8.4.0
TensorFlow: 2.7.0
```
Implementations in other libs:
- Pillow:
- ee079ae67e/src/libImaging/Geometry.c (L889-L899)
- ee079ae67e/src/libImaging/Geometry.c (L11)
- `a[2] == 0`
- Scikit-Image :
- dev v0.19.0 uses scipy ndi.zoom:
- 38fae50c3f/skimage/transform/_warps.py (L180-L188)
- 47bb6febaa/scipy/ndimage/src/ni_interpolation.c (L775-L779)
- 47bb6febaa/scipy/ndimage/src/ni_interpolation.c (L479)
Additionally:
- Updated upsampling tests
cc ezyang gchanan albanD mruberry jbschlosser walterddr fmassa heitorschueroff ppwwyyxx
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64501
Reviewed By: anjali411
Differential Revision: D32361901
Pulled By: jbschlosser
fbshipit-source-id: df906f4d25a2b2180e1942ffbab2cc14600aeed2
2021-11-15 14:28:19 -08:00
yanbing-j
12026124cc
Avoid the view for mkldnn case in 1D convolution ( #68166 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68034
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68166
Reviewed By: mrshenli
Differential Revision: D32432444
Pulled By: jbschlosser
fbshipit-source-id: fc4e626d497d9e4597628a18eb89b94518bb3b33
2021-11-15 11:56:45 -08:00
eqy
a1ace029e2
Add host-side memory requirement for test_softmax_64bit_indexing ( #67922 )
...
Summary:
https://github.com/pytorch/pytorch/issues/67910
The original `largeTensorTest` decorator didn't account for the additional host-side memory requirements.
Thanks to crcrpar for raising the issue; CC ptrblck
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67922
Reviewed By: malfet
Differential Revision: D32308602
Pulled By: mruberry
fbshipit-source-id: 97b7d2c39fe63c1a8269402f72186026a89f6b4c
2021-11-11 09:24:15 -08:00
Dani El-Ayyass
f171c78c04
add unpack_sequence and unpad_sequence functions ( #66550 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/66549
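A short usage sketch of the two new helpers (names from the title; treated as the inverses of `pack_sequence` / `pad_sequence`):
```python
import torch
from torch.nn.utils.rnn import (
    pack_sequence, unpack_sequence, pad_sequence, unpad_sequence,
)

seqs = [torch.tensor([1, 2, 3]), torch.tensor([4, 5])]

# unpack_sequence inverts pack_sequence
packed = pack_sequence(seqs)
assert all(torch.equal(a, b) for a, b in zip(unpack_sequence(packed), seqs))

# unpad_sequence inverts pad_sequence, given the original lengths
padded = pad_sequence(seqs, batch_first=True)
lengths = torch.tensor([len(s) for s in seqs])
restored = unpad_sequence(padded, lengths, batch_first=True)
assert all(torch.equal(a, b) for a, b in zip(restored, seqs))
```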
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66550
Reviewed By: malfet
Differential Revision: D32299193
Pulled By: jbschlosser
fbshipit-source-id: 96c92d73d3d40b7424778b2365e0c8bb1ae56cfb
2021-11-10 15:15:08 -08:00
Joel Schlosser
9a2db6f091
Factor backend routing logic out of convolution forward ( #67790 )
...
Summary:
This PR introduces a new function `_select_conv_backend` that returns a `ConvBackend` enum representing the selected backend for a given set of convolution inputs and params.
The function and enum are exposed to python for testing purposes through `torch/csrc/Module.cpp` (please let me know if there's a better place to do this).
A new set of tests validates that the correct backend is selected for several sets of inputs + params. Some backends aren't tested yet:
* nnpack (for mobile)
* xnnpack (for mobile)
* winograd 3x3 (for mobile)
Some flowcharts for reference: [flowchart images omitted]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67790
Reviewed By: zou3519
Differential Revision: D32280878
Pulled By: jbschlosser
fbshipit-source-id: 0ce55174f470f65c9b5345b9980cf12251f3abbb
2021-11-10 07:53:55 -08:00
Xiao Wang
f6a4c80a5a
Refactor cuDNN Convolution memory format and Conv-Bias-Relu code ( #65594 )
...
Summary:
This PR makes several changes:
- Changed function `bool cudnn_conv_use_channels_last(...)` to `at::MemoryFormat cudnn_conv_suggest_memory_format(...)`
- Removed `resize_` from the cuDNN convolution code. Added a new overload of `TensorDescriptor::set` that also takes the desired memory format of the tensor.
- Disabled the usage of double + channels_last on cuDNN Conv-Relu and Conv-Bias-Relu. Call `.contiguous(memory_format)` before passing data to cuDNN functions.
- Disabled the usage of cuDNN fused Conv-Bias-Relu for cuDNN versions < 8.0 due to a CUDNN_STATUS_NOT_SUPPORTED error; the native fallback path is used instead.
- Let the Conv-Bias-Relu code respect the global `allow_tf32` flag.
According to the cuDNN documentation, double + NHWC is generally not supported.
Close https://github.com/pytorch/pytorch/pull/66968
Fix https://github.com/pytorch/pytorch/issues/55301
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65594
Reviewed By: jbschlosser, malfet
Differential Revision: D32175766
Pulled By: ngimel
fbshipit-source-id: 7ba079c9f7c46fc56f8bfef05bad0854acf380d7
2021-11-05 11:50:55 -07:00
Alban Desmaison
bb8978f605
Revert D32175963: [pytorch][PR] Converting hardswish to structured kernels with metatensor support
...
Test Plan: revert-hammer
Differential Revision:
D32175963 (57335a9ee3 )
Original commit changeset: f4d749c6aeaf
fbshipit-source-id: 6d68a96cf872c2d7b518c061875b9336bca0043a
2021-11-05 07:04:40 -07:00
John Clow
57335a9ee3
Converting hardswish to structured kernels with metatensor support ( #66899 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66899
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D32175963
Pulled By: Gamrix
fbshipit-source-id: f4d749c6aeaf064084be72361607ea4f3f6bc91d
2021-11-04 19:02:00 -07:00
soulitzer
83e8612d11
Clean up test autograd ( #67413 )
...
Summary:
Partially fixes https://github.com/pytorch/pytorch/issues/66066
This PR:
- cleans up op-specific testing from test_autograd. test_autograd should be reserved for testing generic autograd functionality
- tests related to an operator are better colocated
- see the tracker for details
What to think about when moving tests to their correct test suite:
- naming, make sure it's not too generic
- how the test is parametrized, sometimes we need to add/remove a device/dtype parameter
- can this be merged with existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67413
Reviewed By: jbschlosser, albanD
Differential Revision: D32031480
Pulled By: soulitzer
fbshipit-source-id: 8e13da1e58a38d5cecbfdfd4fe2b4fe6f816897f
2021-11-03 15:26:09 -07:00
Xiao Wang
31cf3d6aad
Fix adaptive_max_pool2d for channels-last on CUDA ( #67697 )
...
Summary:
Fix https://github.com/pytorch/pytorch/issues/67239
The CUDA kernels for `adaptive_max_pool2d` (forward and backward) were written for contiguous outputs. If the output is non-contiguous, we first create a contiguous copy, let the kernel write into that contiguous memory, and then copy the result back to the original non-contiguous memory space.
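A small repro-style sketch of the case this covers (needs a CUDA device; channels-last input leads to non-contiguous outputs):
```python
import torch
import torch.nn.functional as F

# Channels-last CUDA input: the output is non-contiguous in the default sense,
# which is the case the fixed kernels now handle.
x = torch.randn(2, 3, 8, 8, device="cuda").to(memory_format=torch.channels_last)
x.requires_grad_()

out, idx = F.adaptive_max_pool2d(x, output_size=(4, 4), return_indices=True)
out.sum().backward()
print(out.shape, x.grad.shape)
```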
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67697
Reviewed By: ejguan
Differential Revision: D32112443
Pulled By: ngimel
fbshipit-source-id: 0e3bf06d042200c651a79d13b75484526fde11fe
2021-11-03 09:47:29 -07:00
John Shen
234bd6dc56
[quantized] Add bilinear quantized grid_sample ( #66879 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66879
This adds a quantized implementation for bilinear grid_sample. Bicubic interpolation cannot be supported as easily since we rely on the linearity of quantization to operate on the raw values, i.e.
f(q(a), q(b)) = q(f(a, b)), where f is the linear interpolation function.
ghstack-source-id: 141321116
Test Plan: test_quantization
Reviewed By: kimishpatel
Differential Revision: D31656893
fbshipit-source-id: d0bc31da8ce93daf031a142decebf4a155943f0f
2021-11-01 14:44:26 -07:00
kshitij12345
885a8e53ba
replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes ( #65201 )
...
Summary:
Reference https://github.com/pytorch/pytorch/issues/53849
Replace `onlyOnCPUAndCUDA` with `onlyNativeDeviceTypes`, which includes `cpu`, `cuda`, and `meta`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65201
Reviewed By: mrshenli
Differential Revision: D31299718
Pulled By: mruberry
fbshipit-source-id: 2d8356450c035d6a314209ab51b2c237583920fd
2021-11-01 09:22:34 -07:00
Joel Schlosser
16d937b0df
Fix strided _conv_double_backward() with 3D input / weight ( #67283 )
...
Summary:
Removes the 3D special case logic in `_convolution_double_backward()` that never worked.
The logic was never called previously since `convolution()` expands input / weight from 3D -> 4D before passing them to backends; backend-specific backward calls thus save the 4D version to pass to `_convolution_double_backward()`.
The new general `convolution_backward()` saves the original 3D input / weight, uncovering the bug.
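A small sketch of the case the fix covers, using `gradgradcheck` through a strided `conv1d` (3D input and weight):
```python
import torch
import torch.nn.functional as F
from torch.autograd import gradgradcheck

# 3D input (N, C, L) and 3D weight, with stride > 1; double backward routes
# through _convolution_double_backward.
x = torch.randn(2, 3, 8, dtype=torch.double, requires_grad=True)
w = torch.randn(4, 3, 3, dtype=torch.double, requires_grad=True)

def strided_conv(inp, weight):
    return F.conv1d(inp, weight, stride=2)

assert gradgradcheck(strided_conv, (x, w))
```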
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67283
Reviewed By: anjali411
Differential Revision: D32021100
Pulled By: jbschlosser
fbshipit-source-id: 0916bcaa77ef49545848b344d6385b33bacf473d
2021-10-29 09:48:53 -07:00
Sameer Deshmukh
edd4d246c3
Accept 0-dim channel inputs in convolution layer ( #66256 )
...
Summary:
Fixes https://github.com/pytorch/pytorch/issues/56998 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66256
Reviewed By: mrshenli
Differential Revision: D31859428
Pulled By: jbschlosser
fbshipit-source-id: 034b6c1ce35aac50eabfa09bbcd8b1e3c8b171bd
2021-10-25 08:12:29 -07:00
Eddie Yan
d9c4b3feab
Do rowwisemoments computation in float for half LayerNorm ( #66920 )
...
Summary:
https://github.com/pytorch/pytorch/issues/66707
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66920
Reviewed By: mrshenli
Differential Revision: D31850612
Pulled By: ngimel
fbshipit-source-id: a95a33567285dcf9ee28d33f503cead3268960f9
2021-10-22 09:50:42 -07:00